devLux

A Data Generator for Mocking Objects

Whether in front-end or back-end development, every project needs to generate test data and to fake certain methods (that is, to isolate a component from other assemblies) in order to unit test single components. This article provides a small and simple implementation of a data generator for assembling mock objects programmatically. It does not cover faking methods.

Introduction

As we would do in a Test Driven Development (TDD) approach, we will first demonstrate how to use such a data generator. Afterwards we show how to implement a solution in very few lines of code.

But first we introduce a sample implementation of the repository pattern, which encapsulates the data access of your entities. Thanks to this encapsulation, the pattern makes it easy to exchange a repository for a fake object. That is where the data generator comes into play: the generator is responsible for setting up the properties of the entity mocks.

Repository Pattern

Per Martin Fowler’s definition of the Repository Pattern, "[the repository] mediates between the domain and data mapping layers using a collection-like interface for accessing domain objects." In other words, the repository provides means for accessing, updating, adding, and deleting entities through the same object.

Our repository implements the following interface:

public interface IRepository<TEntity>
{
  IEnumerable<TEntity> GetAll();
  void AddItem(TEntity entity);
  void RemoveItem(TEntity entity);
  bool ContainsItem(TEntity entity);
  TEntity Find(Func<TEntity, bool> expr);
  IQueryable<TEntity> FindAll(Func<TEntity, bool> expr);
}

In this article we only need to build a repository around an in-memory collection of entities. It goes without saying that in a real-world situation you will need to implement the data-store-specific parts in your repository. In a data-driven application you will also very likely implement the Unit of Work Pattern around your repository. We will discuss this topic in a later article.

Since we are using only in-memory collections, all of the IRepository methods can be extracted into a generic base class. We end up with the following class, which custom repository implementations can extend.

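using System;
using System.Collections.Generic;
using System.Linq;

public class Repository<TEntity> : IRepository<TEntity>
{
  // One lock object per repository instance guards every access to _items.
  private readonly object _syncRoot = new object();
  protected object SyncRoot
  {
    get { return _syncRoot; }
  }

  private readonly ICollection<TEntity> _items = new HashSet<TEntity>();

  public IEnumerable<TEntity> GetAll()
  {
    // Return a snapshot so callers can enumerate without holding the lock.
    lock (SyncRoot)
    {
      return new List<TEntity>(_items);
    }
  }

  public void AddItem(TEntity entity)
  {
    lock (SyncRoot)
    {
      _items.Add(entity);
    }
  }

  public void RemoveItem(TEntity entity)
  {
    lock (SyncRoot)
    {
      _items.Remove(entity);
    }
  }

  public bool ContainsItem(TEntity entity)
  {
    lock (SyncRoot)
    {
      return _items.Contains(entity);
    }
  }

  public TEntity Find(Func<TEntity, bool> expr)
  {
    // SingleOrDefault throws if more than one item matches the predicate.
    lock (SyncRoot)
    {
      return _items.Where(expr).SingleOrDefault();
    }
  }

  public IQueryable<TEntity> FindAll(Func<TEntity, bool> expr)
  {
    // Materialize inside the lock; a deferred query would read _items later, unprotected.
    lock (SyncRoot)
    {
      return _items.Where(expr).ToList().AsQueryable();
    }
  }
}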

An example of such a custom repository implementation is a repository for blog posts. For demonstration purposes the entity is kept very simple.

public class BlogPost
{
  public int Id { get; set; }
  public string Title { get; set; }
  public string Body { get; set; }
  public DateTime CreatedDate { get; set; }
  public DateTime? ModifiedDate { get; set; }
}

The repository providing access to the blog posts extends the generic base class Repository<BlogPost>.

public class BlogPostRepository : Repository<BlogPost>
{ }

A typical consumer of the above repository, e.g. a web service or a controller, could look as follows:

public class BlogPostService : Service
{
  private readonly IRepository<BlogPost> _repository;

  public BlogPostService(IRepository<BlogPost> repository)
  {
    _repository = repository;
  }

  public object Get(BlogPostRequest request)
  {
    throw new NotImplementedException();
  }
}

Please note that we refer to the interface IRepository<BlogPost> and not to the specific class of our blog post repository. This later allows us to replace the provided repository instance with a fake one, either by simply passing a different implementation as a constructor parameter or, in a more sophisticated setup, through Dependency Injection (DI).
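The constructor-injection idea can be sketched in isolation. This is only an illustration under stated assumptions: IReadRepository is a trimmed-down copy of the interface with a single method, and PostCounter and FixedRepository are hypothetical names that do not appear in the article's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down copy of the article's interface (one method only, for brevity).
public interface IReadRepository<TEntity>
{
  IEnumerable<TEntity> GetAll();
}

// Hypothetical consumer: it only knows the interface, never a concrete class.
public class PostCounter
{
  private readonly IReadRepository<string> _repository;

  public PostCounter(IReadRepository<string> repository)
  {
    if (repository == null) throw new ArgumentNullException("repository");
    _repository = repository;
  }

  public int Count()
  {
    return _repository.GetAll().Count();
  }
}

// Hypothetical fake that returns canned data instead of touching a data store.
public class FixedRepository : IReadRepository<string>
{
  private readonly string[] _items;

  public FixedRepository(params string[] items)
  {
    _items = items;
  }

  public IEnumerable<string> GetAll()
  {
    return _items;
  }
}
```

A test can construct new PostCounter(new FixedRepository("first", "second")) and check Count() without any real data store being involved.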

Now that we have set up an infrastructure for accessing our entities, we will describe how to easily provide fake blog posts for testing purposes.

Mocking the Repository

With the sample blog post repository in place, we can provide an alternate repository that implements the same interface but returns fake objects instead of real ones. A random number of items between 30 and 100 is generated and added to the repository, and the data generator populates each item's properties with random data that matches certain criteria.

public class BlogPostRepositoryMock : BlogPostRepository
{
  public BlogPostRepositoryMock()
  {
    DataGenerator builder = new DataGenerator();

    int numberOfItems = builder.Integer(30, 100);
    for (int i = 0; i < numberOfItems; i++)
    {
      BlogPost mock = new BlogPost();
      mock.Id = i + 1;
      mock.Title = builder.Title();
      mock.Body = builder.Paragraphs();
      mock.CreatedDate = builder.Date(new DateTime(2003, 1, 31));
      mock.ModifiedDate = builder.DateOptional(mock.CreatedDate);
      
      base.AddItem(mock);
    }
  }
}

As our main blog post repository is only a repository around an in-memory collection, we can simply extend it. In the class constructor we fill the collection of the base class with fake data provided by the not yet existing DataGenerator.

Referring to a class that does not exist yet is deliberate and in keeping with TDD. The approach is useful because it forces us to think about how clients might use our class before we start the actual implementation. On the one hand, this clarifies the detailed requirements at an early stage. On the other hand, it usually leads to good method signatures and a clean API.

Implementation

After analyzing the mock repository implementation, we can extract the requirements: the generator should provide methods that return random values for the built-in language types (e.g. Integer, Boolean, DateTime) and methods that return random words taken from a dictionary. An alternative approach would be to concatenate characters randomly instead of using a predefined dictionary.
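The alternative of concatenating random characters could look roughly like the following sketch. RandomWordSketch is a hypothetical helper that is not part of the article's DataGenerator, and the restriction to lowercase ASCII letters is an assumption made for brevity.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the article's DataGenerator.
public static class RandomWordSketch
{
  private const string Letters = "abcdefghijklmnopqrstuvwxyz";

  // Builds a word of minLength..maxLength lowercase letters (both bounds inclusive).
  public static string Word(Random random, int minLength, int maxLength)
  {
    if (random == null) throw new ArgumentNullException("random");
    if (minLength < 1 || minLength > maxLength)
      throw new ArgumentOutOfRangeException("minLength");

    // Random.Next treats the upper bound as exclusive, hence the + 1.
    int length = random.Next(minLength, maxLength + 1);

    StringBuilder builder = new StringBuilder(length);
    for (int i = 0; i < length; i++)
    {
      builder.Append(Letters[random.Next(Letters.Length)]);
    }
    return builder.ToString();
  }
}
```

The resulting words are not pronounceable, which is the main reason a dictionary tends to produce more realistic-looking test data.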

This leads us to the following IDataGenerator interface that provides methods returning primitive types but also some nice convenience methods for building text fragments.

public interface IDataGenerator
{
  bool Boolean();
  bool? BooleanOptional();
  DateTime Date(DateTime minValue, DateTime? maxValue = null);
  DateTime? DateOptional(DateTime minValue, DateTime? maxValue = null);
  int Integer(int minValue, int maxValue);
  int? IntegerOptional(int minValue, int maxValue);
  IEnumerable<TItemType> Items<TItemType>(IEnumerable<TItemType> items, int minNumberOfItems = 1, int maxNumberOfItems = 1);
  string Paragraph(int minNumberOfSentences = 3, int maxNumberOfSentences = 15);
  string Paragraphs(int minNumberOfParagraphes = 1, int maxNumberOfParagraphes = 5, int minNumberOfSentences = 3, int maxNumberOfSentences = 10);
  string Sentence(int minNumberOfWords = 5, int maxNumberOfWords = 20);
  string Sentences(int minNumberOfSentences = 1, int maxNumberOfSentences = 5, int minNumberOfWords = 5, int maxNumberOfWords = 20);
  string Title(int minNumberOfWords = 1, int maxNumberOfWords = 3);
  string Word();
  string Words(int minNumberOfWords, int maxNumberOfWords);
}

The methods returning primitive types are straightforward to implement. They all follow the same principle: each comes as a pair, one method that returns the random value and an Optional variant that returns either such a value or null.

public int? IntegerOptional(int minValue, int maxValue)
{
  if (Next(2) == 1)
  {
    return Integer(minValue, maxValue);
  }

  return null;
}

public int Integer(int minValue, int maxValue)
{
  Assert(minValue < maxValue, "minValue must be less than maxValue");

  return Next(minValue, maxValue);
}
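The snippets call Next and Assert without showing where they come from. One reading that is consistent with the calls used in this article (Next(2), Next(min, max), NextDouble()) is that the generator derives from System.Random. That is an assumption, as are the Assert helper and the class name SketchGenerator in the following minimal sketch.

```csharp
using System;

// Hypothetical stand-in for the article's DataGenerator; only the parts needed
// to make the integer and boolean methods compile are sketched here.
public class SketchGenerator : Random
{
  public SketchGenerator() { }
  public SketchGenerator(int seed) : base(seed) { }

  // Assumed helper: throws when a precondition does not hold.
  protected static void Assert(bool condition, string message)
  {
    if (!condition) throw new ArgumentException(message);
  }

  public int Integer(int minValue, int maxValue)
  {
    Assert(minValue < maxValue, "minValue must be less than maxValue");
    // Inherited Random.Next: minValue is inclusive, maxValue is exclusive.
    return Next(minValue, maxValue);
  }

  public int? IntegerOptional(int minValue, int maxValue)
  {
    // Next(2) yields 0 or 1, so roughly half of the results are null.
    return Next(2) == 1 ? Integer(minValue, maxValue) : (int?)null;
  }

  public bool Boolean()
  {
    return Next(2) == 1;
  }

  public bool? BooleanOptional()
  {
    return Next(2) == 1 ? Boolean() : (bool?)null;
  }
}
```

Under this reading, Integer(30, 100) returns values from 30 to 99, because Random.Next excludes the upper bound. Passing a fixed seed to the constructor makes a test run reproducible.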

The method returning random date times might also be of interest:

public DateTime Date(DateTime minValue, DateTime? maxValue = null)
{
  if (maxValue == null)
  {
    maxValue = DateTime.Now;
  }

  Assert(minValue > DateTime.MinValue, "minValue must be greater than DateTime.MinValue");
  Assert(maxValue < DateTime.MaxValue, "maxValue must be less than DateTime.MaxValue");
  Assert(minValue < maxValue, "minValue must be less than maxValue");

  double rand = (maxValue.Value.Ticks * 1.0 - minValue.Ticks * 1.0) * NextDouble() + minValue.Ticks * 1.0;
  long ticks = Convert.ToInt64(rand);

  DateTime result = new DateTime(ticks);

  return result;
}
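The tick arithmetic in Date is a linear interpolation between the two bounds. The following sketch isolates that step in a hypothetical helper (DateSketch.Interpolate is not part of the article's code) and takes the random fraction as a parameter so that the result can be checked by hand. As a small variation, it computes the span as a long before scaling, which avoids converting the absolute tick counts to double.

```csharp
using System;

// Hypothetical helper that isolates the interpolation step of Date.
public static class DateSketch
{
  // fraction is expected in [0, 1), like the value returned by Random.NextDouble().
  public static DateTime Interpolate(DateTime minValue, DateTime maxValue, double fraction)
  {
    if (minValue >= maxValue)
      throw new ArgumentException("minValue must be less than maxValue");
    if (fraction < 0.0 || fraction >= 1.0)
      throw new ArgumentOutOfRangeException("fraction");

    // Scale the span, not the absolute tick counts, to limit rounding error.
    long span = maxValue.Ticks - minValue.Ticks;
    long offset = (long)(span * fraction);

    return new DateTime(minValue.Ticks + offset);
  }
}
```

With the bounds 1 January 2000 and 3 January 2000 and a fraction of 0.5, the helper returns 2 January 2000; a fraction of 0 returns the lower bound itself.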

The text-building parts are based on an in-memory dictionary. The methods depend on each other, which leads to a grammar that can be described in Extended Backus-Naur Form (EBNF).

Paragraphs          = Paragraph, { Paragraph }
Paragraph           = Sentences, Paragraph_Delimiter
Paragraph_Delimiter = '\n' | '\r\n'
Sentences           = Sentence, { Sentence }
Sentence            = Words, Sentence_Delimiter
Sentence_Delimiter  = '.' | '!' | '?'
Title               = Words
Words               = Word, { Word }

The above grammar can be adopted one-to-one by our data generator. Ultimately, all text methods rely on the method Word.

private static readonly string[] WORDS = new string[] { "the", "of" };
public string Words(int minNumberOfWords, int maxNumberOfWords)
{
  Assert(minNumberOfWords >= 0, "minNumberOfWords must not be negative");
  Assert(minNumberOfWords <= maxNumberOfWords, "minNumberOfWords must be less than or equal to maxNumberOfWords");

  int numberOfWords = Next(minNumberOfWords, maxNumberOfWords + 1);

  StringBuilder builder = new StringBuilder();
  for (int i = 0; i < numberOfWords; i++)
  {
    // Separate every word after the first one with a single space.
    builder.Append(i > 0 ? " " + Word() : Word());
  }

  return builder.ToString();
}

public string Word()
{
  int wordIndex = Next(WORDS.Length);
  return WORDS[wordIndex];
}
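To illustrate how the remaining text methods can follow the grammar, here is a minimal sketch. It is not the author's implementation: the class name TextSketch, the five-word vocabulary, the base class Random, the capitalized first letter, and the single space between sentences are all assumptions made for this example.

```csharp
using System;
using System.Text;

// Hypothetical sketch of the text methods; the article's own versions may differ.
public class TextSketch : Random
{
  private static readonly string[] Vocabulary = { "the", "of", "and", "to", "in" };
  private static readonly char[] SentenceDelimiters = { '.', '!', '?' };

  public TextSketch(int seed) : base(seed) { }

  public string Word()
  {
    return Vocabulary[Next(Vocabulary.Length)];
  }

  // Words = Word, { Word }
  public string Words(int minNumberOfWords, int maxNumberOfWords)
  {
    if (minNumberOfWords < 1 || minNumberOfWords > maxNumberOfWords)
      throw new ArgumentOutOfRangeException("minNumberOfWords");

    int count = Next(minNumberOfWords, maxNumberOfWords + 1);
    StringBuilder builder = new StringBuilder();
    for (int i = 0; i < count; i++)
    {
      if (i > 0) builder.Append(' ');
      builder.Append(Word());
    }
    return builder.ToString();
  }

  // Sentence = Words, Sentence_Delimiter (first letter capitalized for readability)
  public string Sentence(int minNumberOfWords = 5, int maxNumberOfWords = 20)
  {
    string words = Words(minNumberOfWords, maxNumberOfWords);
    char delimiter = SentenceDelimiters[Next(SentenceDelimiters.Length)];
    return char.ToUpperInvariant(words[0]) + words.Substring(1) + delimiter;
  }

  // Sentences = Sentence, { Sentence }
  public string Sentences(int minNumberOfSentences = 1, int maxNumberOfSentences = 5)
  {
    if (minNumberOfSentences < 1 || minNumberOfSentences > maxNumberOfSentences)
      throw new ArgumentOutOfRangeException("minNumberOfSentences");

    int count = Next(minNumberOfSentences, maxNumberOfSentences + 1);
    string[] sentences = new string[count];
    for (int i = 0; i < count; i++)
    {
      sentences[i] = Sentence();
    }
    return string.Join(" ", sentences);
  }

  // Paragraph = Sentences, Paragraph_Delimiter
  public string Paragraph(int minNumberOfSentences = 3, int maxNumberOfSentences = 15)
  {
    return Sentences(minNumberOfSentences, maxNumberOfSentences) + "\n";
  }
}
```

Each method delegates to the production one level below it in the grammar, so a change to the vocabulary or to the delimiters needs to be made in one place only.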

The remaining bits are self-explanatory, but feel free to download my working implementation.

Outlook

We mentioned in the first paragraph that we would not go into faking APIs. Nevertheless, I would like to point to some interesting projects, and I intend to write an article about this topic eventually: