Simple Content Synchronization with Umbraco

In this post, I'll share how I tackled a small project requiring the import of content from an old site or files into an Umbraco CMS. While there are paid options available for such tasks, my project's budget constraints led me to develop a basic, yet effective, logic for content importation. I aim to outline this logic, offering it as a starting point or inspiration for others. Please note, the code I'm presenting isn't the pinnacle of optimization, but it's a solid foundation you can build upon. This approach is designed for a single language setup, though adapting it for multilingual content is straightforward. Let's dive into the C# code provided and explore how it works, step by step.

String Extensions

The StringExtensions class contains two extension methods:

  1. GetDeterministicHashCode: This method calculates a hash code for a string in a deterministic manner, ensuring consistent results across different executions.
  2. CamelCase: Converts the first character of a string to lowercase.

 

public static class StringExtensions
{
	public static int GetDeterministicHashCode(this string str)
	{
		unchecked
		{
			int hash1 = (5381 << 16) + 5381;
			int hash2 = hash1;

			for (int i = 0; i < str.Length; i += 2)
			{
				hash1 = (hash1 << 5) + hash1 ^ str[i];
				if (i == str.Length - 1)
					break;
				hash2 = (hash2 << 5) + hash2 ^ str[i + 1];
			}

			return hash1 + hash2 * 1566083941;
		}
	}

	public static string CamelCase(this string str)
	{
		if (string.IsNullOrEmpty(str))
		{
			return str;
		}

		return char.ToLowerInvariant(str[0]) + str.Substring(1);
	}
}
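
A 32-bit hash like the one above can, in rare cases, produce the same value for two different inputs. If that risk matters for your import, a longer digest is one alternative. The following is a minimal sketch of that swapped-in technique using SHA-256 from the .NET base class library (.NET 5 or later); the StableHashSketch class and Sha256Hex method are hypothetical names and not part of the library described in this post.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class StableHashSketch
{
	// Hypothetical helper (not part of the original code):
	// returns the SHA-256 digest of the UTF-8 bytes of the input as an uppercase hex string.
	public static string Sha256Hex(string value)
	{
		byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(value));
		return Convert.ToHexString(digest);
	}
}
```

Using a digest like this would mean storing the hash in a text property and comparing strings instead of integers, so the rest of the code in this post would need to be adapted accordingly.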

Custom Attribute: UmbracoAliasAttribute

This attribute is applied to properties in your source model and specifies the alias of the matching Umbraco content property. The alias is camel-cased automatically, because Umbraco generates camel-cased property aliases by default.

[AttributeUsage(AttributeTargets.Property)]
public class UmbracoAliasAttribute : Attribute
{
	public UmbracoAliasAttribute(string alias)
	{
		Alias = alias.CamelCase();
	}

	public string Alias { get; }
}
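
To illustrate how the attribute is meant to be read back, here is a small sketch that resolves the Umbraco alias for a property via reflection: the attribute's alias when present, otherwise the camel-cased property name. The attribute is repeated in compact form so the sketch compiles on its own; SampleSource and AliasResolver are hypothetical names used only for this example.

```csharp
using System;
using System.Reflection;

// Compact copy of the attribute above so this sketch is self-contained.
[AttributeUsage(AttributeTargets.Property)]
public class UmbracoAliasAttribute : Attribute
{
	public UmbracoAliasAttribute(string alias) => Alias = AliasResolver.CamelCase(alias);

	public string Alias { get; }
}

// Hypothetical source model, for illustration only.
public class SampleSource
{
	[UmbracoAlias("CarModel")]
	public string Model { get; set; } = "";

	public string Color { get; set; } = "";
}

public static class AliasResolver
{
	// Lowercases the first character, like the CamelCase extension method.
	public static string CamelCase(string value) =>
		string.IsNullOrEmpty(value) ? value : char.ToLowerInvariant(value[0]) + value.Substring(1);

	// Hypothetical helper: attribute alias if present, otherwise the camel-cased property name.
	public static string Resolve(PropertyInfo property)
	{
		var attribute = property.GetCustomAttribute<UmbracoAliasAttribute>();
		return attribute != null ? attribute.Alias : CamelCase(property.Name);
	}
}
```

With these types, the Model property resolves to the alias carModel, while Color (which has no attribute) resolves to color.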

Interface: IImportSource

This interface defines properties and methods that importing sources must implement. It includes properties for a unique identifier and a name, along with a method to calculate a unique hash for an import source.

public interface IImportSource
{
	public string UniqueIdentifier { get; }

	public string Name { get; }

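	public int GetUniqueHash()
	{
		// Serialize using the runtime type; serializing as IImportSource would only include the interface's own properties.
		return System.Text.Json.JsonSerializer.Serialize(this, this.GetType()).GetDeterministicHashCode();
	}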
}
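
One detail worth knowing when hashing via serialization: System.Text.Json writes the properties of the type it is asked to serialize. For a generic call that is the declared type, not the runtime type, so an object passed as an interface only has the interface's properties written. The sketch below shows the difference; IHasName, Car, and SerializationSketch are hypothetical names for illustration only.

```csharp
using System.Text.Json;

// Hypothetical interface and implementation, for illustration only.
public interface IHasName
{
	string Name { get; }
}

public class Car : IHasName
{
	public string Name { get; set; } = "";

	public string Color { get; set; } = "";
}

public static class SerializationSketch
{
	// The generic overload infers IHasName as the type to serialize,
	// so only the interface's properties end up in the JSON.
	public static string ByDeclaredType(IHasName item) => JsonSerializer.Serialize(item);

	// Passing the runtime type explicitly includes all public properties of the concrete class.
	public static string ByRuntimeType(IHasName item) => JsonSerializer.Serialize(item, item.GetType());
}
```

For a change-detection hash, the runtime-type variant is the one that reacts to changes in properties such as Color.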

Class: SynchronizableContentType

Represents a synchronizable content type with properties like ContentTypeAlias, UniqueIdentifierPropertyName, and HashPropertyName.

This class ties the import to the correct content type. It is also where you define which properties hold the unique identifier (linking a content item to its external source) and the hash (the last known hash code of the imported object).

public class SynchronizableContentType
{
	public SynchronizableContentType(string contentTypeAlias, string uniqueIdentifierPropertyName, string hashPropertyName)
	{
		ContentTypeAlias = contentTypeAlias;
		UniqueIdentifierPropertyName = uniqueIdentifierPropertyName;
		HashPropertyName = hashPropertyName;
	}

	public string ContentTypeAlias { get; }

	public string UniqueIdentifierPropertyName { get; }

	public string HashPropertyName { get; }
}

Abstract Class: ContentSynchronizer

An abstract class responsible for synchronizing content. It includes methods to fetch items for import, run the synchronization process, create new content, update existing content, and clean up stale content.

How It Works

  1. Initialization: The ContentSynchronizer class is initialized with an IContentService, a SynchronizableContentType, and a content root ID.
  2. Content Import: The RunAsync method fetches the items to import and the existing content from Umbraco.
  3. Comparison: For each imported item, the method looks up the existing content by its unique identifier (creating new content when none exists) and compares the stored hash with the hash of the imported data.
  4. Updating Content: If the hashes differ, it updates the content with the imported data, then saves and publishes it.
  5. Cleanup: After synchronization, the method removes stale content that is no longer present in the import data.

public abstract class ContentSynchronizer<TImportSource>
where TImportSource : IImportSource
{
	private const int BatchSize = 50;

	private readonly IContentService _contentService;
	private readonly SynchronizableContentType _contentType;
	private readonly int _contentRootId;

	protected ContentSynchronizer(IContentService contentService, SynchronizableContentType contentType, int contentRootId)
	{
		_contentService = contentService;
		_contentType = contentType;
		_contentRootId = contentRootId;
	}

	public abstract Task<IList<TImportSource>> GetItemsToImportAsync();

	public async Task RunAsync()
	{
		var importItems = await GetItemsToImportAsync();
		var children = GetChildrenFromCms();

		foreach (var importItem in importItems)
		{
			children.TryGetValue(importItem.UniqueIdentifier, out var content);
			content ??= CreateNewContent(importItem);

			if (IsContentChanged(content, importItem))
			{
				UpdateContent(content, importItem);
				_contentService.SaveAndPublish(content);
			}
		}

		Cleanup(importItems);
	}

	private IContent CreateNewContent(IImportSource importItem)
	{
		return _contentService.Create(importItem.Name, _contentRootId, _contentType.ContentTypeAlias);
	}

	private bool IsContentChanged(IContent currentContent, IImportSource source)
	{
		var hash = source.GetUniqueHash();
		return currentContent.GetValue<int>(_contentType.HashPropertyName) != hash;
	}

	private void UpdateContent(IContent currentContent, IImportSource source)
	{
		if (source == null) throw new ArgumentNullException(nameof(source));
		// always set hash and unique identifier
		currentContent.SetValue(_contentType.HashPropertyName, source.GetUniqueHash().ToString());
		currentContent.SetValue(_contentType.UniqueIdentifierPropertyName, source.UniqueIdentifier);

		// Get the type of the object
		var type = source.GetType();

		// Get all public properties of the object
		var properties = type.GetProperties(BindingFlags.Public | BindingFlags.Instance);

		foreach (var property in properties)
		{
			var propertyName = property.Name.CamelCase();
			var attribute = property.GetCustomAttribute<UmbracoAliasAttribute>();
			if (attribute != null)
			{
				propertyName = attribute.Alias;
			}

			// Skip properties that do not exist on the content type, as well as the unique identifier property
			if (!currentContent.HasProperty(propertyName))
				continue;

			if (propertyName.Equals(_contentType.UniqueIdentifierPropertyName))
				continue;

			if (property.PropertyType == typeof(string) || property.PropertyType == typeof(int))
			{
				// Get the value of the property
				var value = property.GetValue(source, null)?.ToString();

				if (property.PropertyType == typeof(int))
				{
					if (value != null) currentContent.SetValue(propertyName, int.Parse(value));
				}
				else
				{
					currentContent.SetValue(propertyName, value);
				}
			}
		}
	}

	private void Cleanup(IEnumerable<TImportSource> importItems)
	{
		var children = GetChildrenFromCms();
		var itemsToDelete = children.Keys.Except(importItems.Select(i => i.UniqueIdentifier));
		var itemsToDeleteContent = itemsToDelete.Select(i => children[i]);
		foreach (var content in itemsToDeleteContent)
		{
			_contentService.Delete(content);
		}
	}

	private Dictionary<string, IContent> GetChildrenFromCms()
	{
		var result = new Dictionary<string, IContent>();
		var page = 0;
		long totalRecords;

		// Page through the direct children of the root node; the total count comes from the same query.
		do
		{
			var children = _contentService.GetPagedChildren(_contentRootId, page, BatchSize, out totalRecords);
			foreach (var child in children)
			{
				result.Add(GetUniqueId(child), child);
			}

			page++;
		} while ((long)page * BatchSize < totalRecords);

		return result;
	}

	private string GetUniqueId(IContent content)
	{
		return content.GetValue<string>(_contentType.UniqueIdentifierPropertyName)!;
	}
}
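
The create, update, and cleanup decisions described under "How It Works" can also be sketched without any Umbraco dependency. The example below works on two plain dictionaries that map a unique identifier to a hash: one for what is stored in the CMS and one for what the import delivers. SyncPlanSketch and Plan are hypothetical names; this is only an illustration of the comparison logic, not a replacement for the class above.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SyncPlanSketch
{
	// Hypothetical helper:
	// 'existing' maps unique identifier -> hash stored in the CMS,
	// 'incoming' maps unique identifier -> hash computed from the import source.
	public static (List<string> Create, List<string> Update, List<string> Delete) Plan(
		IReadOnlyDictionary<string, int> existing,
		IReadOnlyDictionary<string, int> incoming)
	{
		// Items only present in the import must be created.
		var create = incoming.Keys.Where(id => !existing.ContainsKey(id)).ToList();

		// Items present on both sides are updated only when the hash differs.
		var update = incoming
			.Where(pair => existing.TryGetValue(pair.Key, out var storedHash) && storedHash != pair.Value)
			.Select(pair => pair.Key)
			.ToList();

		// Items only present in the CMS are stale and can be removed.
		var delete = existing.Keys.Where(id => !incoming.ContainsKey(id)).ToList();

		return (create, update, delete);
	}
}
```

In the ContentSynchronizer itself, creating and updating happen in the same loop, because freshly created content has no stored hash yet and is therefore always treated as changed.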

So how about a real example of an implementation?

First, we need a source object. This one maps directly to the items in a JSON file, so it is straightforward.

It implements the IImportSource interface, which requires two properties:

  1. UniqueIdentifier: the key that links the CMS content item to the source item. In our case this is the license plate, because it is unique.
  2. Name: the name of the content item within the CMS. We chose the license plate, but it could also be a combination of the car's model and type, for example.

public class ClassicCarSourceItem : IImportSource
{
	// Defines a property for the car's image, stored in a JSON property named "Image".
	[JsonPropertyName("Image")]
	[UmbracoAlias(nameof(Models.ClassicCarRegisterItem.Image))]
	public string OriginalImage { get; set; }

	// Stores the car's license plate, color, year, and model from the JSON.
	[JsonPropertyName("LicensePlate")]
	[UmbracoAlias(nameof(Models.ClassicCarRegisterItem.LicensePlate))]
	public string LicensePlate { get; set; }

	[JsonPropertyName("Color")]
	[UmbracoAlias(nameof(Models.ClassicCarRegisterItem.Color))]
	public string Color { get; set; }

	[JsonPropertyName("Year")]
	[UmbracoAlias(nameof(Models.ClassicCarRegisterItem.Year))]
	public string Year { get; set; }

	[JsonPropertyName("Model")]
	[UmbracoAlias(nameof(Models.ClassicCarRegisterItem.CarModel))]
	public string Model { get; set; }

	[JsonPropertyName("Type")]
	[UmbracoAlias(nameof(Models.ClassicCarRegisterItem.CarType))]
	public string CarType { get; set; }


	// These are helper properties for identifying and naming items.
	public string UniqueIdentifier => LicensePlate;
	public string Name => LicensePlate;
}

The ClassicCarSynchronizer is a content synchronizer built specifically for importing classic car register items into the CMS. It extends the generic ContentSynchronizer class by specifying the source item class defined above, ClassicCarSourceItem, as its type argument.

Here's a breakdown of what the class does:

  • Initialization: The constructor of ClassicCarSynchronizer takes an IWebHostEnvironment, an IContentService, and an integer contentRootId as parameters. The contentRootId is the parent node beneath which the items will be created or updated.

  • Content Type Specification: In the constructor, a SynchronizableContentType is created and passed to the base class. This specifies the model type alias for the ClassicCarRegisterItem and the properties to be used for synchronization, namely the license plate and hash values.

  • Data Import: The GetItemsToImportAsync method overrides the base class method to asynchronously read the classic car data from a JSON file. This is the only source-specific logic needed to import the classic cars into the CMS.

You can now build simple import synchronizers by defining only the content type (with its two required properties) and a function that fetches the items from an external source and maps them onto a simple, flat object.

public class ClassicCarSynchronizer : ContentSynchronizer<ClassicCarSourceItem>
{
	private readonly IWebHostEnvironment _webHostEnvironment;

	// Constructor initializes the synchronizer with Umbraco content service and web environment details.
	public ClassicCarSynchronizer(IWebHostEnvironment webHostEnvironment, IContentService contentService, int contentRootId)
		: base(contentService, 
			new SynchronizableContentType(Models.ClassicCarRegisterItem.ModelTypeAlias, 
				nameof(Models.ClassicCarRegisterItem.LicensePlate).CamelCase(), 
				nameof(Models.ClassicCarRegisterItem.Hash).CamelCase()), 
			contentRootId)
	{
		_webHostEnvironment = webHostEnvironment;
	}

	// Asynchronously reads the items to import from the "classic-cars.json" file in the content root.
	public override async Task<IList<ClassicCarSourceItem>> GetItemsToImportAsync()
	{
		var filePath = Path.Join(_webHostEnvironment.ContentRootPath, "classic-cars.json");
		var json = await File.ReadAllTextAsync(filePath);

		// Deserialize is synchronous, so only the file read is awaited.
		var cars = System.Text.Json.JsonSerializer.Deserialize<IList<ClassicCarSourceItem>>(json);

		// Deserialize returns null when the file contains the JSON literal "null".
		return cars ?? new List<ClassicCarSourceItem>();
	}
}

So with this basic library, I can easily set up multiple content synchronizations from different sources to Umbraco. 

The provided code block offers a simple foundation for implementing content synchronization in Umbraco-based projects, demonstrating effective utilization of Umbraco APIs and C# language features.

If you have any questions or remarks, don't hesitate to contact me.