An App From Scratch: Part 11 – Template Data Modelling and Test Optimizations

9 minute reading time; ~1670 words


Hello and welcome to this post!

Today we’re going to touch on some database design concepts as we explore how to store and manage the records created with a template. We’ll also look at some testing optimizations to improve performance, and thus, development speed.

Let’s get into it!

If you haven’t read the previous posts, I’d suggest you start at An App From Scratch: Part 1 – What To Build. You can also find all the related documents and code here: Building Tailgunner.

What To Do

This post, we’re going to dig into a piece of US2, which focuses on entering data. I want to lay a solid foundation for working with data, so we’re going to:

  • Create and implement the database design to hold records.
  • Create a way to view the records for a specific template.

Database Design

This tool allows the user to define custom forms, and collect data in any format they’d like. This of course introduces an interesting question of how to store the data that’s input. There are a few factors to consider:

  • Data size
  • Data access patterns
  • Partitioning and security
  • Scalability

In thinking about these things, we come up with these observations:

  • The records are likely to be small, unless they use text fields.
  • The structure of the records will vary.
  • The initial intent of the system is to manage data, and provide useful exports.
    • This tool isn’t a spreadsheet, intended to manipulate data with formulas.
  • A user could have anywhere from dozens to thousands of records.

With those things in mind, the initial models that came to mind were:

  • Document Store
    This stores each record as a simple JSON blob.
  • Schema-per-Entity (SPE)
    With this, we would create a table builder, and generate a database table customized to that user’s template when they save.
  • Entity-Attribute-Value (EAV)
    In this model, you can store any structure of records you want, and each value becomes one row in the table.

Each model has advantages and trade-offs, so let’s walk through them in a little more detail.

Document Store

I initially considered this because it’s easy to use, and there is database support for indexing and querying JSON. However, I didn’t like the storage inefficiency, where every single record you save is also saving the structure along with it.

Record Table

User IDTemplate IDJSON Data
11{"title": "Food Inc.", "tags": ["documentary", "news"], "release_year": 2008, "length_minutes": 93, "description": "An unflattering look inside America's corporate controlled food industry."}

Schema-per-Entity (SPE)

I considered this structure because it meets multiple criteria. It’s easy to secure, because data never gets mixed in the same tables. It’s easy to scale and manage because it offloads a lot of responsibilities to the database, allowing us to take full advantage of indexing and querying capabilities.

The reason I didn’t choose this comes down to complexity. To work with it, I would have to build a set of tools to actually create, manage, and delete the database tables. This is actually my favored approach, but I won’t be shifting to it until later in the project when I have a stronger need for the benefits.

Record_1_4_data

User IDTemplate IDTitleTagsRelease_YearLength_MinutesDescription
14Food Inc.["documentary", "news"]200893An unflattering look inside America's corporate controlled food industry.

Entity-Attribute-Value (EAV)

To me, this is the balance point between the previous two approaches. It’s flexible like the document store, and takes better advantage of the database structures. Although this approach is falling out of favor as many developers shift to JSON stores, it made sense for this project at its current scale.

The main drawback is that complex queries and indexing are more challenging. However, at this scale, I consider it a reasonable trade-off for the ease of implementation.

Record Table

Record IDUser IDTemplate ID
114

Record Data Table

Record IDTemplate Field IDString_ValueText_ValueInteger_Value
11Food Inc.NULLNULL
12NULL["documentary", "news"]NULL
13NULLNULL2008
14NULLNULL93
15NULLAn unflattering look inside America's corporate controlled food industry.NULL

Supporting The Data Model

To support this data model, we need to modify our Template Record Value model. In addition to the standard model, we’ll also need a way to map our template data types to columns and back. To support this, I’ve added a list of types, and a few functions to let us access and handle these fields easily.

app/Models/TemplateRecordValue.php
private const TYPE_TO_COLUMN_MAP = [
    'integer' => 'integer_value',
    'string' => 'string_value',
    'text' => 'text_value',
    'dropdown' => 'string_value',
    'checkbox' => 'string_value'
];

/**
 * Get database column name for storing template field type.
 */
public static function getColumnForType(string $fieldType): string
{
    return self::TYPE_TO_COLUMN_MAP[$fieldType] ?? 'string_value';
}

/**
 * Get value based on field type
 */
public function getTypedValue(string $fieldType): mixed
{
    $column = self::getColumnForType($fieldType);
    return $this->$column;
}

/**
 * Set value based on field type
 */
public function setTypedValue(string $fieldType, mixed $value): void
{
    $column = self::getColumnForType($fieldType);
    $this->$column = $value;
}

The Record Transformation

Along with the changes to the database model, we need another tool: a way to convert the individual record values into a cohesive record.

This class takes a list of values, and returns a list of records in a user-friendly format, making them easier to work with.

Some interesting pieces about this are:

  • The class is marked final, preventing its constants, methods, or properties from being overridden by subclasses. Since all the methods are static, allowing overrides is unnecessary, making this a good safety mechanism.
  • The class has a private constructor. Since everything it does is stateless (no function needs to maintain data outside itself), this prevents anyone from trying to create a stateful instance of it.
PHP
final class TemplateRecordTransformer
{
    /**
     * Prevent instantiation
     */
    private function __construct() {}

Create Query Scope

One of the things we need to do in multiple places is to access the template with fields and metadata. To simplify this, I’ve added a Laravel local scope, which is a function to add a prepackaged set of constraints to a query .

app/Models/Template.php
/**
 * Get template with field information.
 *
 * @param \Illuminate\Database\Eloquent\Builder $query
 * @param bool $withExtendedFields Include all commonly fields.
 * @param bool $withAllFields Include all fields.
 */
public function scopeWithMetaAndFields($query, $withExtendedFields = false, $withAllFields = false)
{
    // The base fields
    $fieldsInfo = [
        'id',
        'template_id',
        'label',
        'name',
        'type',
        'order',
    ];

    // Get the commonly used extended options
    if ($withExtendedFields || $withAllFields) {
        $fieldsInfo = array_merge(
            $fieldsInfo,
            ['extended_options']
        );
    }

    // Get everything
    if ($withAllFields) {
        $fieldsInfo = array_merge(
            $fieldsInfo,
            ['created_at', 'updated_at']
        );
    }

    return $query
        ->select(['id', 'title', 'description'])
        ->with([
            'fields' => function($query) use ($fieldsInfo) {
                $query->select($fieldsInfo)
                ->orderBy('order');
            }
        ]);
}

With this scope, we can replace the conditions from lines 37 to 43 with a single query (line 1 below), making our code more consistent and readable.

PHP
$template = Template::withMetaAndFields()
    ->where('user_id', auth()->id())
    ->findOrFail($request->template);

Testing Optimizations

During the testing phase, I made several optimizations to improve efficiency and speed.

Multi-Key Array Assert

I was surprised to find that while PHPUnit has an assert to check that an array contains a key, it lacks a version that you can use with multiple keys, leading to code like this:

PHP
$loadedFieldArray = $loadedTemplate->fields->first()->toArray();
$this->assertArrayHasKey('id', $loadedFieldArray);
$this->assertArrayHasKey('template_id', $loadedFieldArray);
$this->assertArrayHasKey('label', $loadedFieldArray);
$this->assertArrayHasKey('name', $loadedFieldArray);
$this->assertArrayHasKey('type', $loadedFieldArray);
$this->assertArrayHasKey('order', $loadedFieldArray);

Obviously, this is non-optimal, since it repeats itself a lot, so we’re going to add a multi-key assert function to our code. This goes in the base TestCase code, so that we can use it in any of our tests.

src/tests/TestCase.php
<?php

namespace Tests;

use Illuminate\Foundation\Testing\TestCase as BaseTestCase;

abstract class TestCase extends BaseTestCase
{
    /**
     * Assert that an array has the specified keys.
     */
    protected function assertArrayHasKeys(array $keys, array $array): void
    {
        foreach ($keys as $key) {
            $this->assertArrayHasKey($key, $array);
        }
    }
}

With this, we can now simplify the asserts from above into this code, making it easier to read and reason about!

PHP
$loadedFieldArray = $loadedTemplate->fields->first()->toArray();
$this->assertArrayHasKeys([
    'id',
    'template_id',
    'label',
    'name',
    'type',
    'order',
], $loadedFieldArray);

Initialize Once, Use Many

In some of our tests, we initialized some required components (like a template) for every test. Since RefreshDatabase rolls back database changes after each test (with transactions), we can optimize the tests by creating the shared components only once. Since the template isn’t the object under test, there’s no point in re-creating it every time.

In this test below, you can see how that’s implemented. First we set up a variable to hold a template (line 14 – 15), and then initialize it as part of our setup (only once, line 21 – 24). This lets us take out the repeated creation on each test (line 29), and reuse one template when we need it (line 32, 40).

PHP
<?php

namespace Tests\Unit\Models;

use App\Models\Template;
use App\Models\TemplateField;
use Illuminate\Foundation\Testing\RefreshDatabase;
use Tests\TestCase;

class TemplateFieldTest extends TestCase
{
    use RefreshDatabase;

    private Template $template;
    private bool $templateCreated = false;

    protected function setUp(): void
    {
        parent::setUp();

        if (!$this->templateCreated) {
            $this->template = Template::factory()->create();
            $this->templateCreated = true;
        }
    }

    public function test_can_create_template_field()
    {
        //$template = Template::factory()->create();

        $field = TemplateField::factory()->create([
            //'template_id' => $template->id,
            'template_id' => $this->template->id,
            'name' => 'Test Field',
            'type' => 'text'
        ]);

        $this->assertDatabaseHas('template_fields', [
            'id' => $field->id,
            //'template_id' => $template->id,
            'template_id' => $this->template->id,
            'name' => 'Test Field',
            'type' => 'text'
        ]);
    }
}

The Pull Requests

Wrapping Up

This time around, we got into the technical design decisions around how to structure what appears on the surface to be simple data. Every day, engineers have to weigh the pros and cons of many decisions like this, hoping their choices won’t lead to issues down the line.

While there’s rarely a perfect answer, the goal is to choose as wisely as we can with the context and constraints we know, so we don’t paint ourselves into a corner down the road.

As always, I hope you learned something interesting through this post, and stay tuned for the next!

Leave me a comment if you found this helpful, or you found something I could do better!

Cheers!


Comments

Leave a Reply

Your email address will not be published. Required fields are marked *