An Agreeable Procrastination – and the blog of Niels Kühnel

Code, the universe and everything

Archive for the ‘Uncategorized’ Category

A 101 on the Localization framework in Umbraco 5

Yes! Umbraco 5 is out, and I remember talking with sir Alex Norcliffe about having one assembly even non-Umbraco projects could use to leverage some of its awesomeness.

That’s “Umbraco.Framework.dll”.

So, you can use Umbraco 5 localization in Umbraco 5 projects, and also in any other project. Just reference “Umbraco.Framework”.

In this blog post I’ll run through:

  • How to use localized texts in views and code
  • How to define your texts
  • MVC conventions (how to set texts for view model labels, validation errors etc. without attributes)
  • How ITextSource can be used to keep your texts in a Google Spreadsheet.
  • How to define custom languages and fallbacks

How to print texts

In views you import the namespace Umbraco.Framework and then you can write


<div>@Html.GetText("Hello")</div>

If you need parameters for your text you’ll write


<div>@Html.GetText("Hello", new {UserName="Bob", NewMessages = 4, Weight=158})</div>

If you write the above and have defined a text “Hello” with the pattern


Hello {UserName}. You have #NewMessages{0: no new messages | 1: one new message | {#} new messages}. You weigh {Weight:N2} kg.

your view would print “Hello Bob. You have 4 new messages. You weigh 158.00 kg.”

In your controllers and other code the easiest way to localize is to import the namespace “Umbraco.Framework.Localization” and use the extension method on string called Localize. This method assumes that the string is a text key, so in code you would write:


var message = "Hello".Localize(this, new {UserName="Bob", NewMessages = 4, Weight=158});

You have to pass “this” as the first parameter because texts are “namespaced”, but I’ll cover that in a later blog post. (Basically, you can embed texts in assemblies, and these texts won’t interfere with your other texts even if they have the same key. You can also override an assembly’s texts with your own. It’s awesome, but not covered here.)

Defining texts
If you’re in an Umbraco 5 project it’s pretty simple. You just define your texts in ~/App_Data/Umbraco/LocalizationEntries.xml and start coding. NOTE: You’ll have to add Namespace="" to the root tag, so your file should look like this:

<?xml version="1.0" encoding="utf-8" ?>
<Translations Namespace="">
  <Text Key="Welcome">
    <Translation Language="*">Hello {UserName}.</Translation>
    <Translation Language="da-DK">Hej {UserName}.</Translation>
  </Text>
</Translations>

Note that if a text is missing, the framework will look for a text in the language “*”. This is the “default” language all other languages will fall back to if a text is missing. If your site’s default language is English you should define your English texts with “*”.

Standalone

If you’re not in an Umbraco 5 project you only have to add this to Application_Start in global.asax.cs:

LocalizationWebConfig.SetupDefaultManager<MvcApplication>();
LocalizationWebConfig.SetupMvcDefaults();

And then you’re ready to go.

Standalone, the path for LocalizationEntries.xml is ~/App_Config/LocalizationEntries.xml

MVC conventions

The localization framework provides a ModelMetaDataProvider and ModelBinder that automatically binds texts to view model labels and validation messages following the conventions described here.
Assume you have a view model like


using System.ComponentModel.DataAnnotations;

public class FooModel
{
    [Required]
    public string Name { get; set; }

    [Range(0, 250)]
    public int Age { get; set; }
}

Properties
If you would like your views to print something other than “Name” and “Age” when you write Html.LabelFor(m=>m.Name) you simply define texts with the keys “FooModel.Name” and “FooModel.Age”.

You can also just define keys called “Name” and “Age”. In that case those would be used for all view models with those properties. Also, if you want to be really specific you can prefix the view model’s class name with its namespace.

The framework considers these keys in this order:

  1. Namespace.ClassName.PropertyName
  2. ClassName.PropertyName
  3. PropertyName
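
The lookup order above can be sketched like this. It is only an illustration of the convention, not the framework’s own code: KeyResolver, CandidateKeys and Resolve are hypothetical names, the dictionary stands in for whatever text source you use, and falling back to the bare property name is my assumption about what happens when no text is defined.

```csharp
using System;
using System.Collections.Generic;

public static class KeyResolver
{
    // Candidate keys for a property, most specific first.
    public static IEnumerable<string> CandidateKeys(Type modelType, string propertyName)
    {
        yield return modelType.FullName + "." + propertyName; // Namespace.ClassName.PropertyName
        yield return modelType.Name + "." + propertyName;     // ClassName.PropertyName
        yield return propertyName;                            // PropertyName
    }

    // Returns the first defined text, or the property name when none is defined (assumption).
    public static string Resolve(IDictionary<string, string> texts, Type modelType, string propertyName)
    {
        foreach (var key in CandidateKeys(modelType, propertyName))
        {
            string text;
            if (texts.TryGetValue(key, out text))
            {
                return text;
            }
        }
        return propertyName;
    }
}
```

With texts defined for both “FooModel.Name” and “Name”, Resolve picks the “FooModel.Name” text for FooModel and the plain “Name” text for every other view model.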

Validation messages
Validation messages are also localized and, unlike .NET’s localization with resource strings, the messages can use parameters.
You add keys named ClassName.PropertyName.[Name of validation attribute]. (The same patterns as above apply, so Name.Required will do, and including the namespace is more specific.)
For instance, if you add a key with the name FooModel.Name.Required and text “This is so required” this will be used if the Name field is left blank.

You can also add general validation texts for all properties by adding keys named “Validation.Required”, “Validation.Range” etc. If you define a text for “Validation.Required” all required fields will show this validation message if you have not defined specific texts for them.

The texts for validation messages can use the special parameter “Value” and all public properties of the validation attribute. For example, for [Range] you can define your text as “{Value} is not between {Minimum} and {Maximum}” and then you have a nice validation message that can easily be translated and state the limits (standard .NET resource texts are not great for that).
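
To make the idea concrete, here is a rough sketch of how “Value” and an attribute’s public properties could be turned into named parameters. ValidationTextSketch, Parameters and Fill are hypothetical helpers written for this post; the framework’s real implementation and its full pattern syntax (format specifiers, switches) are not reproduced here.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.Text.RegularExpressions;

public static class ValidationTextSketch
{
    // Collects "Value" plus the attribute's public properties as named parameters.
    public static IDictionary<string, object> Parameters(ValidationAttribute attribute, object value)
    {
        var parameters = new Dictionary<string, object> { { "Value", value } };
        foreach (var property in attribute.GetType().GetProperties())
        {
            if (property.GetIndexParameters().Length == 0)
            {
                parameters[property.Name] = property.GetValue(attribute, null);
            }
        }
        return parameters;
    }

    // Replaces {Name} placeholders; unknown names are left untouched.
    public static string Fill(string pattern, IDictionary<string, object> parameters)
    {
        return Regex.Replace(pattern, @"\{(\w+)\}", match =>
        {
            object value;
            return parameters.TryGetValue(match.Groups[1].Value, out value)
                ? Convert.ToString(value)
                : match.Value;
        });
    }
}
```

For [Range(0,250)] and the value 300, filling “{Value} is not between {Minimum} and {Maximum}” gives “300 is not between 0 and 250”.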

Text sources
If you don’t like the XML format or have texts from an external source you can create your own provider that implements ITextSource.
You’ll only have to implement the Get method that returns an IEnumerable of LocalizedText. When creating those you shouldn’t normally bother with more than Key, Language and Pattern. (The rest are “advanced topic material”)
In the demo project (see link at bottom) I have implemented a CSV file source and another one that reads texts from a Google Spreadsheet. If you have your own spreadsheet just click File->Publish and copy the link from the “Get a link to the published data” after you have picked CSV, and insert that link instead of mine. I (think I) have shared my spreadsheet so that you can read it.
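
As a rough idea of what such a source looks like, here is a minimal CSV-backed sketch. Note that LocalizedText and ITextSource below are stand-ins I declare myself, reduced to what this post mentions (a Get method and the Key, Language and Pattern properties); the real types in Umbraco.Framework have more members and may have a different signature, so see the demo project for the actual implementation. CsvTextSource is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the framework type, reduced to the three properties the post mentions.
public class LocalizedText
{
    public string Key { get; set; }
    public string Language { get; set; }
    public string Pattern { get; set; }
}

// Stand-in for the framework interface; the real signature is an assumption.
public interface ITextSource
{
    IEnumerable<LocalizedText> Get();
}

// Reads "Key,Language,Pattern" lines; the pattern itself may contain commas.
public class CsvTextSource : ITextSource
{
    private readonly string _csv;

    public CsvTextSource(string csv)
    {
        _csv = csv;
    }

    public IEnumerable<LocalizedText> Get()
    {
        var lines = _csv.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var line in lines)
        {
            var fields = line.Split(new[] { ',' }, 3); // split on the first two commas only
            if (fields.Length < 3) continue;           // skip malformed lines
            yield return new LocalizedText
            {
                Key = fields[0].Trim(),
                Language = fields[1].Trim(),
                Pattern = fields[2].Trim()
            };
        }
    }
}
```

The sketch does no quoting or escaping, so it only handles the simplest possible CSV files.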

Custom languages
The localization framework is not bound to CultureInfos. It is the standard though, so if you don’t specify anything else it will look up the language code from the current CultureInfo (e.g. “en-US”, “ar-SY”, “da-DK” etc.).
If you choose to define custom languages you can define “fallback chains” for languages so, for example, if a Danish text isn’t found a Norwegian one is better than English. You can see a simple example of how to do this in the demo project.

The CultureInfo used for formatting dates and numbers is inferred from the language key. This means you DO want to follow .NET’s names (“en-US” etc.). You are allowed to add more dashes, so if you need a few texts to differ for Mac and PC users you can define “en-US-mac” and “en-US-pc”. Both will use the “en-US” locale and fall back to “en-US” if no specific text is found. It’s a design choice because it’s easier if you stick with standardized language keys. If you really, really need something else you can subclass LanguageInfo and define your own logic (also an “advanced topic”).
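
The fallback for dashed keys can be sketched as below. This is my own illustration, not the framework’s LanguageInfo logic: LanguageFallbackSketch, Chain and Find are hypothetical names, and the sketch assumes that the first two segments of a key form the .NET culture name and that everything finally falls back to “*”.

```csharp
using System;
using System.Collections.Generic;

public static class LanguageFallbackSketch
{
    // "en-US-mac" gives ["en-US-mac", "en-US", "*"]; "da-DK" gives ["da-DK", "*"].
    public static IList<string> Chain(string languageKey)
    {
        var chain = new List<string> { languageKey };
        var parts = languageKey.Split('-');
        if (parts.Length > 2)
        {
            chain.Add(parts[0] + "-" + parts[1]); // the .NET culture name
        }
        if (languageKey != "*")
        {
            chain.Add("*"); // the default language
        }
        return chain;
    }

    // Picks the first language in the chain that has a text, or null if none has.
    public static string Find(IDictionary<string, string> textsByLanguage, string languageKey)
    {
        foreach (var language in Chain(languageKey))
        {
            string text;
            if (textsByLanguage.TryGetValue(language, out text))
            {
                return text;
            }
        }
        return null;
    }
}
```

A custom fallback chain (Danish to Norwegian to English, say) would simply return a different list from Chain.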

Demo

I’ve created a small demo solution for the purpose, and it can be found here: https://bitbucket.org/nielskuhnel/localization101
It demonstrates the concepts described in this post.

Happy localizing!

Written by niels.kuhnel

March 11, 2012 at 4:43 am

Posted in Uncategorized

The New Localization Framework in Umbraco 5

This is a primer on the new localization framework in Umbraco 5.

Microsoft did a very fine job with System.Globalization when it comes to formatting numbers and dates for different locales. The localization framework in Umbraco adds to this support for grammatical differences between (spoken) languages including differences in plural forms, order of words in sentences etc.

Broadly, the framework consists of:

  • A replacement for resource strings that allows texts to be combined from a multitude of layered sources.
  • A superset of the string.Format syntax with a domain specific template language tailored for handling grammatical differences between languages.

The main objective is to separate grammatical logic from code and to maximize the length of text passages to be localized to give translators maximum context and flexibility. All this while minimizing the number of redundant texts.

Let’s look at a simple example to illustrate why you need this framework. Suppose you want to greet the user in some system with the number of new messages like “Welcome Fletcher. You have 5 new messages”. You quickly realize that this doesn’t work with only 1 message and take a simple approach with

string.Format("Welcome {0}. You have {1} new message(s)", name, count)

That works for English but it’s not suitable for localization because other languages may not support the “word + (plural ending)” form very well. Besides, you probably don’t want your fancy Web 2.0 site to print messages that look like something from a DOS command prompt (y/n?).

Instead you might solve this with

string.Format("Welcome {0}. You have {1} new {2}", name, count, count == 1 ? "message" : "messages")

or

string.Format(Get("Greeting"), name, count, count == 1 ? Get("MessageSingular") : Get("MessagePlural"))

(Assuming that you have a Get method to get resource strings)

That will work great for most Western languages. You may however have cut off the French because they use the singular form for zero too (They have 0 message). And it will definitely not work for Slavic languages because they have much more difficult rules for plural forms. See http://translate.sourceforge.net/wiki/l10n/pluralforms for reference (e.g. one of the Polish cases is n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20))
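
To see how quickly this gets out of hand in code, here is the full Polish rule from that reference written out by hand. PolishPluralSketch is a hypothetical helper, and I use the classic “plik” (file) example for the three word forms rather than “message”.

```csharp
public static class PolishPluralSketch
{
    // Returns 0 (one), 1 (few) or 2 (many) for a non-negative count.
    public static int FormIndex(int n)
    {
        if (n == 1) return 0;
        if (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20)) return 1;
        return 2;
    }

    // Picks the word form for "file": 1 plik, 2 pliki, 5 plików.
    public static string Files(int n)
    {
        string[] forms = { "plik", "pliki", "plików" };
        return n + " " + forms[FormIndex(n)];
    }
}
```

So 22 gives “22 pliki” while 12 gives “12 plików”, and this is logic for just one language.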

Not supporting a lot of exotic languages may be okay for your intended audience but the approach still has some disadvantages:

  1. You have now included language specific grammatical logic in your code. It’s very annoying to fix bugs related to this and it may be a never ending story as new languages are targeted.
  2. You have three different texts to translate for the same message and it’s not very clear in which context the atomic texts “message” and “messages” belong.
  3. You need to explain what {0}, {1} and {2} represent in the text, and it becomes ghastly if you have even more parameters. That may entice you to split the text to reduce the number of parameters but then the translator will lose some of the flexibility.

In Umbraco 5 you’ll write

Localize("Greeting", new { Username = name, MessageCount = count })

From a developer’s perspective that’s perfect because the code above very clearly expresses “I want to have the greeting text here, and I’ll pass these named parameters that it needs”. Now, from a translator’s perspective this is also great because the framework’s entire pattern syntax is available for creating a translation without any compromises.

The English version for this would be

Welcome {Username}. You have #MessageCount{1: 1 new message | {#} new messages}

The first thing you’ll notice is that the named parameters are used instead of numbers. In code anonymous types are used to specify the parameters and in the text it’s clear what the parameters represent. In this example it’s not needed but you can use the normal format specifiers after the name so {MessageCount:N2} would become “5.00”.

The second thing is the “switch” construct that allows you to use different texts for different counts. It has the syntax

#[ParameterName] { [Condition 1] : Text 1 | [Condition 2] : Text 2| Text in other cases}.

Within the switch body the special parameter {#} means the value of the parameter being switched on.

This should open some opportunities as you, without changing the code in your application, could make more interesting texts by changing it to

You have #MessageCount{0: no new messages | 1: one new message | < 10: {#} new messages | a lot of messages!}

There’s a pretty extensive syntax for the switch conditions, and all the plural rules from the plural forms reference linked above are supported.
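
Spelled out as ordinary code, the extended message pattern above corresponds to roughly the logic below, which is exactly what you no longer have to write and maintain per language. SwitchSketch is a hypothetical helper, and the sketch assumes that the conditions are tried from left to right with the last, unconditioned text as the default.

```csharp
public static class SwitchSketch
{
    // Hand-written equivalent of
    // #MessageCount{0: no new messages | 1: one new message | < 10: {#} new messages | a lot of messages!}
    public static string NewMessages(int count)
    {
        if (count == 0) return "no new messages";
        if (count == 1) return "one new message";
        if (count < 10) return count + " new messages"; // {#} is the value switched on
        return "a lot of messages!";
    }
}
```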

Now, you may argue that translators don’t like writing curly braces but the syntax is just what the framework expects. Feel free to make your own easier-to-understand intermediate format for the translator or even create a graphical editor. The point is that grammatical logic is effectively removed from your application’s code.

(By the way, the framework is not locked to this syntax. It’s just the default parser. The framework works on ASTs and you can create your own grammar and parse that into these ASTs instead.)

Now let’s add one final thing to the example. Say we want some HTML in the text as we want it to be

“Welcome <span class='user-name'>Fletcher</span>. You have <span>5 new messages</span>”

With the string.Format approach you may either have to chop the text into tiny pieces or accept that the translations include markup. Neither is desirable. The former approach creates an immense number of texts with very unclear purposes and the latter makes you tired if you decide to change the markup after the software has been translated to 20 languages.

With the Umbraco localization framework you can write

Welcome <NameFormat: {Username}>. You have <MessageFormat: #MessageCount{…} >

And then dynamically specify the markup with

Localize("Greeting", new { Username = name, MessageCount = count, NameFormat = "<span class='user-name'>{#}</span>", MessageFormat = "<span>{#}</span>" })

This gives at least the advantages that 1) you don’t have to split the text so the translator still has full context and flexibility and 2) the translated text is not tied to HTML and you could, in principle, use it on other devices because it just contains “format markers”.

With the default settings the parameter values are HTML encoded but the format you specify is not, so little Bobby <XSS would be greeted with “Welcome <span class='user-name'>Bobby &lt;XSS</span>”
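
The principle is easy to sketch: encode the value, leave the format alone. FormatMarkerSketch is a hypothetical helper built on the standard System.Net.WebUtility.HtmlEncode and is not how the framework does it internally.

```csharp
using System.Net;

public static class FormatMarkerSketch
{
    // Encodes the (untrusted) value, then drops it into the (trusted) format at {#}.
    public static string Apply(string format, string value)
    {
        return format.Replace("{#}", WebUtility.HtmlEncode(value));
    }
}
```

Calling FormatMarkerSketch.Apply("<span class='user-name'>{#}</span>", "Bobby <XSS") returns <span class='user-name'>Bobby &lt;XSS</span>.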

In later blog posts I’ll dive deeper into the syntax and its features, which include reusable templates, switching on timespans, Roman numerals and much more (you can see an up-to-date’ish specification of the grammar here), but my next post will be about

Text sources

My next blog post will be about how text sources are structured, the default XML format and how new sources can be implemented and embedded in assembly manifests. One of the main benefits over ordinary resource strings is that even if texts are embedded in assemblies, texts for other languages can be added from XML files, databases etc. These other sources can also replace/correct texts in existing languages. You’ll also see how texts can be arranged in namespaces to avoid clashes and how properties of MVC view models are automatically mapped to text keys without the use of attributes.

Even if you don’t expect your application to be translated to other languages you can still benefit from the framework as it greatly helps you maintain your texts without hacking your code.

Remember: “Language is vivid. Don’t let computer languages keep it down!”

Written by niels.kuhnel

May 12, 2011 at 3:04 am

Paging entities with eager fetched collections efficiently in NHibernate

UPDATE: For almost any practical purpose batch loading is faster. The exotic cases are not worth the effort. This post is only relevant if you are obsessed with ORM performance tuning and want inspiration from some crazy ideas.

Last time I blogged (it’s been a while) I shared some ideas about how unrelated one and many to many relations could be loaded eagerly without the overhead of the Cartesian explosion. It turned out that this could be implemented in NHibernate by overriding the ANSIJoinFragment and wiring it from a custom dialect (see link at end).
Now, another problem arises when you want to paginate the results from these queries. When you say “give me results 10 to 20” NHibernate literally does that as it hydrates rows 10 to 20 from the result set. This gives you fewer than 10 entities with incomplete collections. At least this happens with SQL Server, and all of the following is related to this issue.
What you really want is the 10th to 20th root entities of your query.

Assume that we have a very simple situation with some Persons (ID, Name, Tags) and Tags (ID, Name) and run a query like this

var persons = session.QueryOver<Person>().OrderBy(x => x.Name).Asc.Skip(5).Take(10)
                    .Fetch(x => x.Tags).Eager
                    .List();

Let’s consider the generated SQL and see where it goes wrong

SELECT TOP (@p0)
  ID0_1_, Name0_1_, Person3_3_, ID3_, ID5_0_, Name5_0_ FROM
    (SELECT [ID0_1_, Name0_1_...], ROW_NUMBER() OVER(ORDER BY this_.Name) as __hibernate_sort_row FROM [Person] this_ left outer join [Tag] tags2_ ON this_.ID = tags2_.Person_id) as query
    WHERE query.__hibernate_sort_row > @p1
    ORDER BY query.__hibernate_sort_row;
@p0 = 10 [Type: Int32 (0)], @p1 = 5 [Type: Int32 (0)]

The problem is that ROW_NUMBER() is used only for offsetting and the good old SELECT TOP … is used for limiting. In the current form neither of these takes into account that multiple rows for the same root (here person) should only be counted once.
If we remove TOP and only use the row number we get the following query that still doesn’t work, as row numbers are unique:

SELECT [ID0_1_, ... ] FROM
    (SELECT [ID0_1_, ... ], ROW_NUMBER() OVER(ORDER BY this_.Name) as __hibernate_sort_row FROM [Person] this_ left outer join [Tag] tags2_ ON this_.ID = tags2_.Person_id) as query
    WHERE query.__hibernate_sort_row > @p1 AND query.__hibernate_sort_row <= @p1 + @p0
    ORDER BY query.__hibernate_sort_row;
@p0 = 10 [Type: Int32 (0)], @p1 = 5 [Type: Int32 (0)]

If RANK is used instead of ROW_NUMBER we actually get what we want but it’s a.) very inefficient with joins as the server has to join all the records of the tables before it can say anything about rank, b.) too easy 🙂

What we really want is to do a sub query that confines the root entities we want and then join the eager load tables on only those. This is very close to the queries that would normally arise from lazy loading, except that the database server does it all at once as fast as it can.
If we consider this general query structure

SELECT {RootFields} (Other fields...) FROM {Root table} {Root alias} (Some joins and what not...) {Where} ORDER BY {Order by}

it must be made into this

SELECT {RootFields} (Other fields...) FROM (
   SELECT __n__, {RootFields} (Other fields...) FROM (
     SELECT ROW_NUMBER() OVER ({Order by}) __n__, {RootFields} FROM {Root table} {Root alias} {Where}
     ) {Main alias}
     (Some joins and what not...)
    ) query WHERE __n__ > @Offset AND __n__ <= @Offset + @Limit
ORDER BY __n__

This puts some restrictions on the where clause. As it is shifted into the root entity query it can’t consider fields from the other joins, but as the joins are assumed to be for eager loading they shouldn’t be filtered in the first place. If you really want to filter the root entities on their relations, EXISTS queries should be used instead.
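
The difference between paging rows and paging root entities can be shown with plain LINQ to Objects, no database involved. This is only an in-memory illustration of the idea, not the dialect code; RootPagingSketch and the small Person class here are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public string Name { get; set; }
    public List<string> Tags { get; set; }
}

public static class RootPagingSketch
{
    // Pages the joined rows: a person with many tags uses up several slots.
    public static List<string> PageRows(IEnumerable<Person> persons, int skip, int take)
    {
        return persons.OrderBy(p => p.Name)
            .SelectMany(p => p.Tags.DefaultIfEmpty(), (p, tag) => p.Name) // left outer join
            .Skip(skip).Take(take)
            .Distinct()
            .ToList();
    }

    // Pages the root entities first; the join happens on the confined set only.
    public static List<string> PageRoots(IEnumerable<Person> persons, int skip, int take)
    {
        return persons.OrderBy(p => p.Name)
            .Skip(skip).Take(take)
            .Select(p => p.Name)
            .ToList();
    }
}
```

With persons A (three tags), B (no tags) and C (two tags), asking for the first two results gives only A when paging rows, but A and B when paging roots.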

In my implementation I give the option to toggle the behavior by adding EagerMsSqlDialect.VeryBigNumber to the limit. Default is off, so the number must be added. This makes it explicit that a dialect specific feature is used.
Curiously, a big number is actually needed as NHibernate cuts off in the hydration process after “limit” records have been processed, so the extra records returned by the query wouldn’t be considered otherwise. I do prefer the former reason for VeryBigNumber though 🙂

Putting it all together
So with the eager fork joins and improved paging you can write code like this

var q = session.QueryOver<Person>().OrderBy(x => x.Name).Asc
                    .Skip(5)
// Add EagerMsSqlDialect.VeryBigNumber to limit to use the improved paging
                    .Take(EagerMsSqlDialect.VeryBigNumber + 10)
                    .Fetch(x => x.Tags).Eager
                    .Fetch(x => x.Skills).Eager
                    .Fetch(x => x.Phonenumbers).Eager
                    .Fetch(x=>x.Skills.First().Tags).Eager
                    .Fetch(x => x.Skills.First().Groups).Eager
                    .TransformUsing(Transformers.DistinctRootEntity)
                    .List();

Here

  1. only one query is sent to the database
  2. no redundant rows are returned (i.e. no Cartesian explosion)
  3. root entities 6 to 15 are returned (skip 5, take 10)
  4. the collections asked for are initialized so no lazy loads are used later

You can find my proof of concept code at https://bitbucket.org/nielskuhnel/eager-nh-extensions together with a small console example.
Unfortunately it doesn’t work with NHibernate.Linq, as a) the current dialect’s JoinFragment isn’t used when joins are constructed and b) Take and Limit are discarded when FetchMany is used. Luckily it works like a charm with the new QueryOver API.

I would really love to hear if it works for you, and if it improves performance in code you have already written.

Written by niels.kuhnel

February 13, 2011 at 5:49 pm

How to use TortoiseHG as a client for Assembla’s Source/Git repository

After a little struggle I have managed to use TortoiseHG as a client for an Assembla Git repository. The biggest problem was to get ssh working with the hg-git extension.

It’s really great because now I can use Mercurial (as I prefer) with Assembla. And with hg-git, the integration with Git repositories is seamless, so I will never have to think about the fact that I’m actually sleeping with the enemy.

So, what you need to do to enjoy the same happiness as I do is this:

1. Generate a public/private key pair to use with Assembla

Download PuTTY. You can get it from here. Choose to download all the binaries (either as zip or installer).

Use “PUTTYGEN.EXE” to create the key pair and update your profile on Assembla as described at http://www.barebonescoder.com/2010/04/assembla-git-windows-you/

Note the stuff about pageant.exe. On my computer I had to start it manually (it’s placed where you installed PuTTY). It’s pretty easy to make it start automatically with Windows. Just add a shortcut in the folder “Startup” in your start menu.

When it’s running you have to add the private key you’ve just created:

Simply click the pageant icon in the system tray (it’s a computer wearing a fedora) and choose “Add Key”. Then select your key and enter your passphrase

The pageant icon in the system tray on Windows 7

2. Install the hg-git extension

(from http://tortoisehg.bitbucket.org/manual/1.0/nonhg.html)

TortoiseHg Windows installers come with the python-git bindings (named dulwich) that hg-git requires, so one only needs to clone the hg-git repository to your local computer:

hg clone http://bitbucket.org/durin42/hg-git/ C:\hg-git

Then enable hggit and bookmarks in your Mercurial.ini file:

[extensions]
bookmarks =
hggit = C:\hg-git\hggit

You can verify that worked by typing hg help hggit

3. Make a copy of TortoisePlink.exe and call it “ssh.exe”

It took me a while to figure this one out. Luckily, this answer worked. Otherwise you get weird “abort: The system cannot find the file specified” errors when trying to clone the repository.

It’s easy: Simply make a copy of “C:\Program Files (x86)\TortoiseHg\TortoisePlink.exe” (or wherever you have installed TortoiseHG) and call it “ssh.exe”.

4. Use the right URL for the repository. That’s not the one from Assembla.

Assembla will tell you to use a URL like “git@git.assembla.com:your-space-name.git”. That doesn’t work. Instead you should use “hg clone git+ssh://git@git.assembla.com/your-space-name.git”.

5. Enjoy

That’s it. Now you can use TortoiseHG as you’re used to while pushing and pulling changes from Assembla. Neat, huh?

Written by niels.kuhnel

September 1, 2010 at 5:37 pm

Best Chrome Theme Ever

I’ve just changed my Chrome theme to “Candies”. It really adds a feeling of Cillit Bang® effectiveness to my browser experience.

The “Candies” theme in Chrome

The only problem is the pin-up girl that it adds to your Chrome home page. So it’s not a good idea to use the theme when you show people stuff on your computer unless you want them to think that you’re a porn enthusiast.

You will show the audience this girl if you don’t watch out.

In short: 1) Use the theme. 2) Avoid situations like this:

Conference Attendee 1: “Did you see the presentation about that new CRM system?”
Conference Attendee 2: “You mean the one by Onan the Barbarian?”
Conference Attendee 1: “Yes.”

Written by niels.kuhnel

August 31, 2010 at 2:08 pm

How to use the cool, new OpenType features in Word 2010

It’s really nice to see that Microsoft finally provides support for the standard they defined themselves only 14 years ago. The advanced typography features of OpenType, now enabled in Word 2010, make your documents look so much more professional and aesthetically pleasing. Especially when they’re combined with features already present in earlier versions of Word.

However, all this is disabled by default. Here’s how to enable it:

1. Change the “normal” style to enable the features everywhere:

OpenType features in Word2010 - Change font settings, small

2. Enable hyphenation

Hyphenation 

And now, look what you’ve got: Kerning (must have), ligatures (must have – special characters for certain character “collisions” like fi, ffi) and old-style figures (optional, but looks really nice and fancy ad agency’ish).

OpenType features in Word 2010

It looks like the everyday typophile can finally retire LyX + XeLaTeX. XeLaTeX will probably still have some sophisticated features that are not present in Word, but in comparison Word documents are extremely convenient as they can be emailed to and fro for comments and collaboration, without telling people how to install MiKTeX and LyX and follow various intricate guides to set it up right.

One more thing… You probably want to create a PDF when your nice-looking document is done. And you have probably used Type 1 flavoured OpenType fonts (the ones from Adobe that have so-called CFF outlines), and then you find that your PDF looks miserable and text can’t be selected when you use Word’s built-in export function. That’s because Word doesn’t support OpenType fonts with CFF outlines. Luckily a free remedy (soon again) exists if you want to have both an automatically generated table of contents AND OpenType fonts with CFF outlines: the PDF-T-Maker. For some reason it doesn’t currently work with Word 2010. I have created a patch and sent it to them. To avoid licensing issues I will not publish it here, though.

Written by niels.kuhnel

August 26, 2010 at 4:47 pm

So it came to this…

I’ve finally overcome my animosity towards blogging. This is it! (ta-da) My blog…

Written by niels.kuhnel

August 11, 2010 at 9:59 pm
