How to avoid duplication of critical data

While developing apps, software developers often have to be sure that a database won't have duplicated data. Users are one of the best example. An username or an email has to be unique for one app. It might seem to be a trivial issue but for some cases, a solution is more sophisticated.

The problem

class UsersController extends Controller
{
    public function store(Request $request, Faker $faker) {
        $email = $request->input('email');

        DB::transaction(function () use ($faker, $email) {
            $userExists = User::where('email', $email)->exists();

            if($userExists) {
                abort(403);
            }

            // it simulates more complex logic which takes more time
            sleep(10);

            $user = new User();
            $user->email = $email;
            $user->name = $faker->name;
            $user->password = Hash::make($faker->password);
            $user->save();
        });
    }
}

I've created this simple method to store users in DB. I check if a user with the given email exists in line 7. It should protect against duplication, but what would happen if I ran 2 requests with the same email at the same time?

duplicated-data

Whoops, our users are duplicated. It happens because some time passes between checking if the user exists and saving a new user. How is it solved?

Solution 1

The first solution is easy. You can just add a unique index. If the given email exists in DB, you'll have a database error, and data won't be added. In Laravel, you can add a simple migration:

Schema::table('users', function (Blueprint $table) {
    $table->string('email')->unique();
});

This solution is fine but not perfect for all cases. Try to imagine situation that you offer SasS. Your customer can decide an email OR a username has to be unique. For this case, you can't use a unique index because your app can have duplicated emails in the whole table.

Solution 2

The second solution is locking this part of the code. Then, you'll be sure that only one request can be processed with the given email. Please have a look on my modified code:

class UsersController extends Controller
{
    public function store(Request $request, Faker $faker) {
        $email = $request->input('email');

        $lock = Cache::lock('storeUser-' . $email, 15);
        if(!$lock->get()) {
            abort(403, 'The given email is processed');
        }

        DB::transaction(function () use ($lock, $faker, $email) {
            $userExists = User::where('email', $email)->exists();

            if($userExists) {
                $lock->release();
                abort(403);
            }

            // it simulates more complex logic which takes more time
            sleep(10);

            $user = new User();
            $user->email = $email;
            $user->name = $faker->name;
            $user->password = Hash::make($faker->password);
            $user->save();

            $lock->release();
        });
    }
}

I use Atomic Locks to lock the code. In the first request, the lock is acquired for a maximum of 15 seconds. Then, it's released after adding the user. During the next requests, $lock->get() checks if the lock still exists. If it exists, the code stops executing. If the lock is released, the code acquires the next lock. Now I'm sure only one request can process creating users using the email.

Comments

Blog Comments powered by Disqus.

Older Post Newer Post