The depths of the bucket

We have always had an unwritten (it might be written somewhere, who knows) rule with Sitecore.

There should be no more than 100 sibling items.

The number 100 is a bit arbitrary; you can have 150, 200 or as many as you want. There are two valid reasons behind this best practice though:

  • Sitecore exposes its content to the user in a tree (at least in the Content Editor). A tree with a very large number of child items under one node is not very usable.
  • The data provider that retrieves items from the database pre-fetches children. So if I have a large number of child items, retrieving the parent item becomes quite a slow operation.

That is precisely why, up until Sitecore 7, if you had a large quantity of items you would store them in a structure that ensured a safe maximum number of siblings, trading a large number of children for a deeper tree.

But with Sitecore 7 things changed: we now have Item Buckets to help us deal with large trees.

The reality is that not much has changed. Item Buckets simply does two things:

  • Automatically classifies items into a deeper tree to ensure there are not too many siblings.
  • Hides all those items so they do not pollute the tree.

Since the items are not shown, authors are forced to use search to locate them.

Creating a bucket

All you need to do is use the Bucket button in the Configure tab. Just remember that only an admin, or users that belong to the Sitecore Client Bucket Management role, will be able to create buckets.
You can also make a Standard Values item a bucket. This is useful if you want authors to create new items that are already buckets by default (they won't have to be members of the Bucket Management role).

You will notice that after creating a bucket the child items do not disappear. In fact, if you care to try this, you will see that none of the items vanish. This is because only items that have been marked as bucketable are swallowed into the bucket. Any item can become bucketable, and the easiest way to do it is to use the options in the ribbon.

[Figure: ribbon bucket options]

In most cases you won't mark individual items as bucketable, but you will apply this to the standard values so it becomes a default value for newly created items of a certain type.

After making changes to the bucketable property of any descendants of a bucket item, you will need to ask Sitecore to reclassify all the items by syncing the bucket.

If you are curious about what happened to our items, you can tick the Buckets checkbox in the View tab to stop hiding them (assuming you have permission to access the View tab!).

[Figure: bucketed items]

Notice that Sitecore used the creation date and time of each item to put them inside folders. Here is the first problem with buckets: all my items have ended up in the same folder, because I created them at the same time. If you think this is unlikely, think again. Let's say you have an importer that creates items. It churns out 3000 items per minute... you see where I am going here.
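To make the problem concrete, here is a minimal sketch in plain .NET (no Sitecore API involved). It assumes the folders are derived from the creation time with a minute-level format such as "yyyy/MM/dd/HH/mm"; that format, the DateFolderDemo class and the FolderFor helper are illustrative assumptions, not taken from Sitecore itself.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class DateFolderDemo
{
    // Assumed format: one folder level each for year, month, day, hour and minute.
    public const string AssumedFormat = "yyyy/MM/dd/HH/mm";

    // Hypothetical helper: the folder path for an item created at the given time.
    public static string FolderFor(DateTime created, string format = AssumedFormat)
    {
        // InvariantCulture keeps "/" as a literal slash regardless of server locale.
        return created.ToString(format, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        var start = new DateTime(2014, 3, 7, 10, 15, 0);

        // Simulate an importer creating 3000 items 20 ms apart (all within one minute).
        var folders = Enumerable.Range(0, 3000)
            .Select(i => FolderFor(start.AddMilliseconds(i * 20)))
            .Distinct()
            .ToList();

        Console.WriteLine(folders.Count); // prints 1
        Console.WriteLine(folders[0]);    // prints 2014/03/07/10/15
    }
}
```

Under these assumptions every one of the 3000 items lands in the same leaf folder, which is exactly the sibling count the bucket was supposed to prevent.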

So you need to think about how you want Sitecore to create those folders. There is another reason to think about them. In this case the items are just used for taxonomy, but what if they were actual pages? What if they were, let's say, News Articles? What would their URL be?

Yes, it will be the full path, as usual. This may or may not enrage your SEO guys, but in any case it is something you have to keep in mind too.

Changing the folders

In older versions of Sitecore it was a bit difficult to change the folder structure of buckets. Nowadays it is very easy: you use the /sitecore/system/Settings/Buckets/Item Buckets Settings item to define rules that dictate how those folders should be created.

You have a few conditions to identify a particular bucket, and then you use actions to specify how the folders should be created. The conditions are a bit limited, since Sitecore chose not to reuse the existing collection of conditions.

The most useful condition is "where the ID of the Item Bucket compares to value". Except it forces me to copy the ID of the bucket; wouldn't it be easier to just pick it from a tree? It also asks me to select a string operator; I don't see myself using "ends with", "starts with" or even "contains", let alone a regular expression, for IDs.
This condition should simply be "where the item bucket is a particular item", with the item picked from the tree.

This is very easy to implement. I don't want to digress too much here, but create a class as follows:

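using System;
using Sitecore.Buckets.Rules.Bucketing;
using Sitecore.Rules.Conditions;

namespace Buckets.Rules
{
     // Rule condition: true when the bucket being processed is the item picked in the rule editor.
     public class WhenBucketIs<T> : WhenCondition<T> where T : BucketingRuleContext
     {
          // ID of the selected bucket item, filled in from the [value,Tree,...] macro of the rule text.
          public string Value { get; set; }

          protected override bool Execute(T ruleContext)
          {
               // Compare the IDs as strings, ignoring any difference in GUID casing.
               return string.Equals(ruleContext.BucketItemId.ToString(), Value, StringComparison.OrdinalIgnoreCase);
          }
     }
}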

Then edit the /sitecore/system/Settings/Rules/Definitions/Elements/Bucketing/Bucket Item Id item and fill in the fields as follows:

Text: where the item bucket is [value,Tree,,a particular item]
Type: Buckets.Rules.WhenBucketIs,Buckets

There is also another condition that allows you to inspect not the bucket, but the item that is being bucketed. This is great as it allows you to have different folder strategies within the same bucket.

You have two main folder creation strategies. The first uses the creation date: you specify the structure of the folders as a .NET date format string, e.g. "yyyy/MM/dd" to have only a three-level structure with year, month and day.

The other uses the first N characters of the ID. This is the one I would choose if I don't care about the URLs and I am using an importer that creates items in a short span of time, as it guarantees a random uniform distribution between the folders.

You can also use the first N characters of the name of the item, but I fail to see how that would be useful; the URLs are not great, and it does not guarantee a random distribution.
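The difference between those two prefix strategies is easy to see with a small sketch in plain .NET. It does not call any Sitecore API: random GUIDs stand in for item IDs, the item names are made up, and PrefixFolderDemo, FolderKey and Distribute are hypothetical helpers. It only models which prefix each item would be grouped under, not how Sitecore nests the resulting folders.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrefixFolderDemo
{
    // Hypothetical helper: the grouping key is simply the first n characters.
    public static string FolderKey(string value, int n)
    {
        return value.Substring(0, Math.Min(n, value.Length));
    }

    // Counts how many values would land under each prefix.
    public static Dictionary<string, int> Distribute(IEnumerable<string> values, int n)
    {
        return values.GroupBy(v => FolderKey(v, n))
                     .ToDictionary(g => g.Key, g => g.Count());
    }

    public static void Main()
    {
        // Random GUIDs stand in for item IDs; "N" formats them as 32 hex digits.
        var ids = Enumerable.Range(0, 3000)
            .Select(_ => Guid.NewGuid().ToString("N"))
            .ToList();

        // Made-up names of the kind an importer might generate.
        var names = Enumerable.Range(0, 3000)
            .Select(i => "News Article " + i)
            .ToList();

        var byId = Distribute(ids, 2);     // at most 256 two-hex-digit prefixes
        var byName = Distribute(names, 2); // every name starts with "Ne"

        Console.WriteLine("ID prefixes: {0}, largest group: {1}", byId.Count, byId.Values.Max());
        Console.WriteLine("Name prefixes: {0}, largest group: {1}", byName.Count, byName.Values.Max());
    }
}
```

With these made-up names every item shares the prefix "Ne" and ends up in a single group of 3000, whereas the random IDs spread across up to 256 prefixes.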

Relationships matter

When you make something into a bucket, all bucketable descendants are put into their corresponding folders according to the strategy selected. This means you no longer have a parent/child relationship. Sometimes this is not ideal, so you can change this behaviour by enabling the Lock Child Relationship Standard Field.

[Figure: Lock Child Relationship]

Remember to tick this on the parent item whose children should stay with it when it goes into the bucket.

Whilst we are discussing Standard Fields, there are a few other options hidden in there. In particular you can define a default query, which pre-populates the search interface when an author selects the bucket (Default Bucket Query field). You can also define a fixed part of the query, which authors cannot remove (Persistent Bucket Query field). This last feature is of little use though, as filters are not ANDed but ORed: if the author adds a wildcard search they will get every descendant, regardless of what the fixed query says.

[Figure: This should not be here]

Other Settings

By default empty folders are not deleted. A folder may become empty when you delete items inside the bucket. If you have a bucket with a lot of item churn this could become a slight issue. You can enable a scheduled agent that looks for empty folders and removes them.

<agent type="Sitecore.Buckets.Tasks.RemoveEmptyBucketFolders" method="Run" interval="00:00:10">
  <DatabaseName>master</DatabaseName>
</agent>

You could also change the template used to create the folders. I am not sure why anybody would need to. If you do, be careful to also exclude it from the index (that's the trick to make sure it does not show in the search results).

Summary

Item Buckets is a useful feature. Just beware that it does not change how Sitecore deals with data; it simply hides things away and ensures they are classified in a deeper tree. Don't forget this is going to impact your URLs, and also how you access those items (through the API, for example, you will be better off using Content Search).

If you want to use items in a bucket as the source of a multilist field... well you shouldn't. Don't despair, Search-enabled fields are coming to the rescue. That'll be the next post.