Keeping Syncing Simple

 A while back I was faced with a choice: how do I sync data between a mobile app and a server-side database? I made the wrong choice. Here's how it happened.

tldr: think about your priorities; avoid the temptation to over-engineer; keep things simple

The Goal

The situation was not a complex one: only one-way syncing was required. The server had a list of lessons, and my app had to stay up to date and present these to the user.

There were a few push updates that could occur. The most common would be that a new lesson was published and the user would be alerted with a Push Notification. The other cases were when a lesson was deleted or updated, in which case the user didn't need to be alerted.

The Two Obvious Choices

The most obvious solution was just to have the app pull all the lessons each time. The alternative was to just send the changes along (known as 'deltas').

The former seemed simple but wasteful. Every time the app updated, it would pull all the data down from the server. If there was an update, most of this data would be the same. If nothing had updated, it would be exactly the same as what was stored on the phone.

The alternative seemed much better. It seemed more efficient, cleaner and more elegant. The data for each lesson would be sent in the Push Notification. I created another instruction for deleting lessons. Updating a lesson was done just by deleting the previous lesson and sending a new one - it would mess up the order, but that could always be fixed in the future.

Why My First Choice Was A Bad One

Immediately I started noticing bugs: lessons appearing twice, or lessons not appearing at all. Imagine what a bad user experience it would be to click on a Push Notification alerting you to a new lesson, only for no new lesson to appear. Part of this occurred because I was using React Native and Push Notifications. A Push Notification could arrive while the user was using the app, it could be clicked on, or it could be swiped away; each of these resulted in different information being sent to the app. The behavior was also different on iPhone and Android.
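
To make the failure mode concrete, here is a minimal TypeScript sketch of the delta approach. The Lesson and Delta shapes and the applyDelta function are hypothetical (they are not my real payloads), and the two delivery scenarios are only one plausible way the bugs above can arise.

```typescript
// Hypothetical shapes: the real notification payloads are not shown in this post.
interface Lesson {
  id: string;
  title: string;
}

type Delta =
  | { kind: "add"; lesson: Lesson }
  | { kind: "delete"; id: string };

// Naive delta handling: trust each notification and mutate the local list.
function applyDelta(local: Lesson[], delta: Delta): Lesson[] {
  if (delta.kind === "add") {
    return [...local, delta.lesson];
  }
  const idToRemove = delta.id;
  return local.filter((lesson) => lesson.id !== idToRemove);
}

const start: Lesson[] = [{ id: "1", title: "Nouns" }];
const addLesson2: Delta = { kind: "add", lesson: { id: "2", title: "Verbs" } };

// Scenario 1 (assumed): the same notification is handled twice.
const duplicated = applyDelta(applyDelta(start, addLesson2), addLesson2);

// Scenario 2 (assumed): the notification never reaches the app.
const missed = start;

console.log(duplicated.map((lesson) => lesson.id)); // ids 1, 2, 2: lesson 2 shows twice
console.log(missed.map((lesson) => lesson.id)); // id 1 only: lesson 2 never shows
```

In both scenarios the phone ends up in a state the server knows nothing about, and no later delta will repair it.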

I had another realization. I had no way of knowing the state of any given phone. Even for the phone in my hand - if it was out of sync, I had no access to that data. I added a secret button that would display the JSON data when pressed, to help me debug. It was hard enough to debug issues with my test devices - there was no way for me to 'simulate' one of my users' experiences.

Rethinking My Priorities

It was only at this point that I decided to write down my priorities. They were to minimize developer time and to minimize bugs. I wanted a quick way to view the state of any user from a device in my hand. I realized that the amount of data being sent was not significant, so minimizing the amount of data being sent was not a priority.

It was obvious. I needed to go to the 'dumb, wasteful' way of pulling all the data down each time. It didn't take long to implement - in fact less time than I'd been wasting trying to debug my original 'delta' solution.
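
A minimal sketch of the idea in TypeScript, assuming a hypothetical Lesson shape and a hypothetical fullSync function that is handed whatever list the server returned (the network call itself is left out):

```typescript
// Hypothetical lesson shape, for illustration only.
interface Lesson {
  id: string;
  title: string;
}

// Full sync: the local list is ignored and the server's list is adopted as-is.
function fullSync(_staleLocal: Lesson[], serverLessons: Lesson[]): Lesson[] {
  // Copy each lesson so later local edits cannot touch the server response object.
  return serverLessons.map((lesson) => ({ ...lesson }));
}

const serverLessons: Lesson[] = [
  { id: "1", title: "Nouns" },
  { id: "2", title: "Verbs" },
];

// Two phones that drifted out of sync in different ways.
const phoneWithDuplicate: Lesson[] = [
  { id: "1", title: "Nouns" },
  { id: "2", title: "Verbs" },
  { id: "2", title: "Verbs" },
];
const phoneMissingLesson: Lesson[] = [{ id: "1", title: "Nouns" }];

const fixedA = fullSync(phoneWithDuplicate, serverLessons);
const fixedB = fullSync(phoneMissingLesson, serverLessons);

// Both phones now hold exactly what the server holds.
console.log(JSON.stringify(fixedA) === JSON.stringify(serverLessons)); // true
console.log(JSON.stringify(fixedB) === JSON.stringify(serverLessons)); // true
```

Because the result depends only on the server's list, any phone's state can be reproduced just by looking at the server, which was the debugging property I wanted.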

Conclusion

The temptation to over-engineer was great. At Google, they always had a complex over-engineered solution, so it seemed prudent to do the same, not considering what might happen in practice. I was left with two take-aways. First, rather than jumping into a solution, I should actually have a think about my priorities when adding a new feature and make my decision accordingly. Second, simple is better. The more my mind got involved in thinking about efficient deltas and hash values, the more I wanted to implement that solution, even when the simpler solution would have avoided all of that.

Comments

  1. A lot of people suggested Firebase, which is good in some situations but gives you limited control. Here are some good comments from Reddit:

    erinaceus_
    There are also alternative yet not that complicated ways to solve this, if performance ever becomes an issue.

    For instance, keep a 'lastChange' timestamp on each lesson. Make the polling call also specify a lastChange query parameter, based on the most recent lesson that the front-end has in memory. For the server: if the query parameter and the current most recent lesson match, don't send anything (e.g. a 204 response). If they don't, send the full list (in a 200 response).
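
This suggestion could be sketched roughly as follows in TypeScript. The Lesson shape, the PollResponse type and the function names handlePoll and applyPoll are all made up for illustration, and timestamps are assumed to be plain numbers.

```typescript
// Hypothetical lesson shape with the suggested lastChange timestamp.
interface Lesson {
  id: string;
  title: string;
  lastChange: number;
}

type PollResponse =
  | { status: 204 }
  | { status: 200; lessons: Lesson[] };

// Newest lastChange in a list, or 0 for an empty list.
function latestChange(lessons: Lesson[]): number {
  return lessons.reduce((max, lesson) => Math.max(max, lesson.lastChange), 0);
}

// Server side: compare the client's newest timestamp with the server's.
function handlePoll(serverLessons: Lesson[], clientLastChange: number): PollResponse {
  if (latestChange(serverLessons) === clientLastChange) {
    return { status: 204 }; // nothing new, send no body
  }
  return { status: 200, lessons: serverLessons }; // send the full list
}

// Client side: keep the local list on 204, replace it on 200.
function applyPoll(local: Lesson[], response: PollResponse): Lesson[] {
  return response.status === 200 ? response.lessons : local;
}
```

One caveat with this sketch: deleting an older lesson does not change the newest timestamp, so the server would need some other way to signal deletions.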

    daveismith
    If using HTTP, use the built-in mechanisms. ETags and If-None-Match let the server return a 304 Not Modified if the client-reported ETag is the same as the version on the server.
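
As a rough illustration of the ETag idea, here is a TypeScript sketch that models the server and client as plain functions rather than real HTTP code. Everything named here (toyChecksum, handleGetLessons, refresh, LessonCache) is hypothetical. The checksum is a toy stand-in for whatever a real server would use to derive an ETag, and only a single If-None-Match value is handled.

```typescript
// Hypothetical lesson shape, for illustration only.
interface Lesson {
  id: string;
  title: string;
}

interface HttpResponse {
  status: 200 | 304;
  etag: string;
  body?: string;
}

// Toy checksum (not cryptographic), used only to derive a version tag.
function toyChecksum(text: string): string {
  let hash = 0;
  for (let i = 0; i < text.length; i++) {
    hash = (hash * 31 + text.charCodeAt(i)) >>> 0;
  }
  return hash.toString(16);
}

// Server side: answer 304 when the client's tag matches the current content.
function handleGetLessons(lessons: Lesson[], ifNoneMatch?: string): HttpResponse {
  const body = JSON.stringify(lessons);
  const etag = `"${toyChecksum(body)}"`;
  if (ifNoneMatch === etag) {
    return { status: 304, etag }; // no body: the client already has this version
  }
  return { status: 200, etag, body };
}

interface LessonCache {
  etag?: string;
  lessons: Lesson[];
}

// Client side: send the cached tag, keep the cache on 304, replace it on 200.
function refresh(cache: LessonCache, serverLessons: Lesson[]): LessonCache {
  const response = handleGetLessons(serverLessons, cache.etag);
  if (response.status === 304 || response.body === undefined) {
    return cache;
  }
  return { etag: response.etag, lessons: JSON.parse(response.body) as Lesson[] };
}
```

The app still asks for "everything" every time, so the simple mental model is kept; the server just skips the body when nothing has changed.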
