Intro
In this article I will show how I handled synchronization of offline-first data between the Cashflow Calendar app and Firebase’s Firestore.
Firebase is a platform offered by Google that can greatly reduce the time it takes to get your application up and running. It is a backend as a service (BaaS) which provides a lot of the functionality that most apps need out of the box, such as authentication, file storage and a database, all for free (up to a point)! The downside with Firebase is that as soon as your app starts to get substantial traffic, the cost of these services climbs steeply.
Cashflow Calendar is an app that I recently released for iOS. Originally this app was strictly going to be an offline app with no online capabilities (for the first version)… but things change. The feature set I eventually landed on required multiple devices to be able to add and modify transactions in a single account (a grouping of transactions).
Now when this online feature came to fruition the app’s persistence layer had already been fully created and was already undergoing rigorous testing (I also used GRDB for this app’s persistence! See Commission App pt. 1 - SwiftUI & Persistence for more information on how I set up persistence for that app).
This meant that I wasn’t going to re-write a huge and vital portion of the app to accommodate this feature; instead I had to make this feature fit in with the current architecture.
Offline Mode
You might be asking: since I was using Firebase, why didn’t I just use Firebase’s offline mode?
Besides the obvious penalty of using more Firebase requests and not having fine-grained control over when to fetch what… in my specific scenario:
- An existing persistence layer was already being used in the app
- Using Firebase for everything makes it very central to the app and potentially difficult to refactor out of your app in the future. With my way I can easily swap Firebase out for something else if needed
Strategy
As I said before, I was going to make this new online feature fit into the existing architecture. So I decided to continue to record our transactions locally (offline) to a SQLite database, and at set intervals the app would sync with Firebase, which would involve pushing up any dirty records and pulling down any records we didn’t have locally.
The strategy for handling synchronization conflicts would be last-write-wins. What this entails is as it sounds: if, during a sync process, modifications are found for a record both locally and on the server (Firebase), the ‘winner’ is the record which was modified most recently.
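A minimal sketch of last-write-wins, to make the rule concrete. The `Record` type and the `lastWriteWins` function are hypothetical names used only for illustration; they are not taken from the app. Ties are resolved in favour of the local copy here, which is an assumption of this sketch.

```swift
import Foundation

// Hypothetical stand-in for a synced record; only the fields needed here.
struct Record: Equatable {
    var name: String
    var updatedAt: Date
}

/// Returns whichever copy was modified most recently.
/// On a tie the local copy is kept (an arbitrary choice for this sketch).
func lastWriteWins(local: Record, remote: Record) -> Record {
    remote.updatedAt > local.updatedAt ? remote : local
}

let base = Date(timeIntervalSince1970: 1_000)
let local = Record(name: "Groceries", updatedAt: base)
let remote = Record(name: "Groceries (edited)", updatedAt: base.addingTimeInterval(30))

// The remote copy was edited 30 seconds later, so it wins.
let winner = lastWriteWins(local: local, remote: remote)
print(winner.name) // prints "Groceries (edited)"
```

Note that the comparison is only as trustworthy as the clocks on the devices that produced the timestamps.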
Model
The strategy of marking a record as dirty to synchronize required a few properties to be added to my models:
- An isDirty boolean indicating whether it has been modified
- An updatedAt timestamp to be used as a comparison when syncing
- A serverId string which, in the case of Firebase, would be the document identifier
The serverId is required because my local ids are generated by auto-increment when writing to the SQLite database. So when we sync with Firebase we need to use a different, external identifier which represents our local record.
Also, for my specific case, since not all models were to be synchronized by default, another flag was needed: isShared. This property gives the synchronization process a means to filter out local objects in the database that do not need syncing.
Let’s look at the Account structure as an example:
struct Account: Codable {
    var id: Int64?
    var userId: String?
    var name: String
    var accountType: AccountType
    var isShared: Bool // Determines if this account should be considered for syncing
    var serverId: String? // External identifier
    var isDirty: Bool // Marks this account as needing to be sync'd
    var createdAt: Date
    var updatedAt: Date // Used to compare Accounts when in conflict
    ...
}
View Model
The adjustments needed in the view models responsible for updating a record in the database:
- Ensure that I set the updatedAt date correctly
- Ensure that I set the record’s isDirty flag to true
Voila! All other update logic stays the same.
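As a rough illustration of those two adjustments, here is a sketch that bundles them into a single mutating helper so neither step can be forgotten. The trimmed-down `Account` and the `rename(to:now:)` method are hypothetical and exist only for this example; the `now` parameter is there purely to make the helper easy to test.

```swift
import Foundation

// Trimmed-down, hypothetical version of the Account model shown earlier.
struct Account {
    var name: String
    var isDirty: Bool
    var updatedAt: Date
}

extension Account {
    /// Applies an edit and stamps the record so the next sync picks it up.
    mutating func rename(to newName: String, now: Date = Date()) {
        name = newName
        updatedAt = now   // used later for last-write-wins comparisons
        isDirty = true    // flags the record for the next sync pass
    }
}
```

Whether this lives on the model or in the view model is a design choice; the point is only that both fields change together on every local edit.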
Sync Process
I wanted the synchronization process to ‘run in the background’ on a set interval. So in my SyncService class I set up a Timer!
final class SyncService {
    private let appDatabase = AppDatabase.shared
    private let queue = OperationQueue()
    private var timer: Timer?
    private var listeners: [ListenerRegistration] = []
    private var isAccountShareEnabled: Bool {
        Configuration.shared.isAccountShareEnabled
    }

    init() {
        guard isAccountShareEnabled else {
            return
        }
        // Run queued work one operation at a time (more on this queue later)
        queue.maxConcurrentOperationCount = 1
        setupConnectedAccountListeners()
        sync()
        timer = Timer.scheduledTimer(
            timeInterval: 60,
            target: self,
            selector: #selector(sync),
            userInfo: nil,
            repeats: true)
    }
    ...
As shown, the sync() method is set up to be called every 60 seconds in the initialization of this class.
The sync method:
@objc
func sync() {
    // ... We'll come back to this barrier block operation
    queue.addBarrierBlock { [weak self] in
        guard let self = self else { return }
        var allEntries: [EntryAccount] = []
        var dirtyEntries: [EntryAccount] = []
        do {
            // 1. Attempt to fetch all local 'Entries' which are marked as dirty
            try self.appDatabase.databaseReader.read { db in
                var query: QueryInterfaceRequest<Entry> = Entry
                    .order(Entry.Columns.entryDate.asc)
                query = query
                    .including(required: Entry.account
                        .filter(Account.Columns.isShared == true))
                    .including(optional: Entry
                        .linkedEntry
                        .including(optional: Entry.linkedAccount))
                allEntries = try query
                    .asRequest(of: EntryAccount.self)
                    .fetchAll(db)
                dirtyEntries = try query
                    .filter(Entry.Columns.isDirty == true)
                    .asRequest(of: EntryAccount.self)
                    .fetchAll(db)
            }
        } catch {
            Logger.shared.error(error.localizedDescription)
        }
        guard !dirtyEntries.isEmpty else { return } // we done here...
        Logger.shared.debug("\(dirtyEntries.count) entries are dirty and need some syncing")
        // 2. We don't need these synchronizations to run serially. Add them to an op queue and wait for all to finish
        let syncOpQueue = OperationQueue()
        let asyncOperations = dirtyEntries.map {
            SyncTransactionOperation(entryAccount: $0)
        }
        syncOpQueue.addOperations(asyncOperations, waitUntilFinished: true)
        // 3. This array will contain all 'Entries' that were successfully sync'd with Firebase
        // The reason this is an array is a side-effect of this app's 'business logic' -- in most cases you'd have a single record sync'd for each dirty local record...
        let entriesSynced = asyncOperations.flatMap { $0.entriesSynced }
        do {
            try self.appDatabase.dbWriter.write { db in
                for syncedEntry in entriesSynced {
                    guard let entryId = syncedEntry.id else { continue }
                    // this will give us the most up-to-date dirtyEntry
                    guard let dirtyEntry = try Entry
                        .filter(id: entryId)
                        .fetchOne(db) else {
                        // Skip only this entry; a `return` here would abandon the rest of the batch
                        continue
                    }
                    // 4. Update our local 'Entry' with the latest sync from FB
                    try syncedEntry.updateChanges(db, from: dirtyEntry)
                    Logger.shared.debug("sync'd entry saved to db.")
                }
            }
        } catch {
            Logger.shared.error(error.localizedDescription)
        }
    }
}
As you can see there’s a lot going on in this method but it’s fairly simple. Let’s dissect it and call out any oddities:
- We initially attempt to fetch all local records that have been updated and marked as dirty
- All local records that need synchronizing are added to an OperationQueue, which allows the actual sync work for each record to run concurrently
- The SyncTransactionOperation is responsible for doing the actual create or update work required with Firebase
- Be sure to update your local records to reflect that they are no longer dirty, as well as any properties that need changing (if the local record’s data needs to be overwritten with new remote data)
Overview of SyncTransactionOperation’s logic: if the local record doesn’t have a serverId then we need to create a remote record. If there is a serverId then we need to fetch the remote version of our record and compare the updatedAt timestamps. If the remote was modified more recently than the local record we need to overwrite our local record, and vice versa if the local record was modified more recently.
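The branching described above can be sketched as a small pure function. This is not the real SyncTransactionOperation (its source is not shown in this post); `SyncAction`, `decideAction` and the `fetchRemoteUpdatedAt` closure are hypothetical names, and the closure is a synchronous stand-in for what would be an asynchronous Firestore fetch in practice.

```swift
import Foundation

// Hypothetical outcome of comparing one local record with its remote copy.
enum SyncAction: Equatable {
    case createRemote   // no serverId yet: the record has never been pushed
    case pushLocal      // local copy is at least as new: overwrite remote
    case pullRemote     // remote copy is newer: overwrite local
}

/// Decides what to do with a dirty local record.
/// The remote timestamp is only looked up when a serverId exists.
func decideAction(serverId: String?,
                  localUpdatedAt: Date,
                  fetchRemoteUpdatedAt: (String) -> Date) -> SyncAction {
    guard let serverId = serverId else { return .createRemote }
    let remoteUpdatedAt = fetchRemoteUpdatedAt(serverId)
    return remoteUpdatedAt > localUpdatedAt ? .pullRemote : .pushLocal
}
```

Keeping the decision separate from the network call makes the three cases easy to unit test without touching Firebase.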
For those of you paying attention, you will have noticed that this synchronization process only addresses updating or creating the records we have locally 🧐
So what about remote records that we don’t have locally… or deletions?!
Firebase Query Limitation
Initially I attempted to handle two more cases in the main sync process: remote records that have been created and that we have no record of locally, as well as deletions (we have a record locally with a serverId but it is no longer found remotely).
I had missed this in Firebase’s documentation, but I was quickly brought up to speed on a limitation of Firestore’s OR-style queries: you cannot pass more than 10 values.
“Use the in operator to combine up to 10 equality (==) clauses on the same field with a logical OR.”
The emphasis is on that 10, and the same cap applied to the not-in operator I was using. I was initially trying to do something like:
let db = Firestore.firestore()
db.collection("accounts/\(accountServerId)/transactions")
    .whereField("serverId", notIn: entryServerIds)
    .getDocuments { snapshot, error in
        guard let snapshot = snapshot else {
            Logger.shared.error(error?.localizedDescription ?? "Failed to fetch account entries (for creation)")
            return
        }
        ...
My expectation was that the snapshot would contain all remote entries that I didn’t have locally, but as soon as my entryServerIds array held more than 10 items this query would error out.
That entryServerIds variable was an array containing all of my local records’ server identifiers.
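For illustration only, the question that query was trying to answer ("what exists remotely that I don’t have, and what do I have that no longer exists remotely?") is a plain set difference once both lists of identifiers are in memory. The sketch below is not what this post ends up doing, and it assumes the remote ids have already been fetched by some other means, which itself costs reads; `diffServerIds` is a hypothetical helper.

```swift
/// Compares the server identifiers known locally with those present remotely.
/// Both arrays are hypothetical inputs for this sketch.
func diffServerIds(local: [String], remote: [String])
    -> (missingLocally: Set<String>, missingRemotely: Set<String>) {
    let localSet = Set(local)
    let remoteSet = Set(remote)
    return (
        missingLocally: remoteSet.subtracting(localSet),   // created remotely
        missingRemotely: localSet.subtracting(remoteSet)   // deleted remotely
    )
}
```

Because the work happens on the client, the size of the id list is no longer bounded by a query operator limit.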
Firestore Listener
To handle remote creations and deletions I opted to use a Firestore Listener! If you remember, in our SyncService class’s initialization we called a setupConnectedAccountListeners() method. This is where we… set up the listeners!
func setupConnectedAccountListeners() {
    guard isAccountShareEnabled else {
        return
    }
    // clear any previous listeners
    for listener in listeners {
        listener.remove()
    }
    // drop the stale registrations so they are not removed again next time
    listeners.removeAll()
    var serverIds: [String] = []
    // 1. Fetch all accounts that are 'share-enabled' and require a listener
    do {
        try self.appDatabase.databaseReader.read { db in
            let accounts = try Account
                .filter(Account.Columns.isShared == true)
                .fetchAll(db)
            serverIds = accounts.compactMap { $0.serverId }
        }
    } catch {
        Logger.shared.error(error.localizedDescription)
    }
    let db = Firestore.firestore()
    for serverId in serverIds {
        let listener = db.collection("accounts/\(serverId)/transactions").addSnapshotListener { [weak self] snapshot, error in
            self?.accountTransactionsListenerHandler(
                snapshot: snapshot,
                error: error)
        }
        listeners.append(listener)
    }
}
As we were unable to query for all remote records these listeners handle notifying us anytime a remote record is updated, created, or deleted.
Another thing to note is that when a listener is first attached, its handler is triggered with an initial snapshot containing all of the existing documents in the collection.
private func accountTransactionsListenerHandler(snapshot: QuerySnapshot?, error: Error?) {
    // Listener work goes through the same queue as sync()
    queue.addOperation { [weak self] in
        guard let self = self else { return }
        guard let snapshot = snapshot else {
            Logger.shared.error(error?.localizedDescription ?? "accountTransactionsListenerHandler error")
            return
        }
        // 1. Fetch all local 'Entries'
        var allEntries: [EntryAccount] = []
        do {
            try self.appDatabase.databaseReader.read { db in
                var query: QueryInterfaceRequest<Entry> = Entry
                    .order(Entry.Columns.entryDate.asc)
                query = query
                    .including(required: Entry.account
                        .filter(Account.Columns.isShared == true))
                    .including(optional: Entry
                        .linkedEntry
                        .including(optional: Entry.linkedAccount))
                allEntries = try query
                    .asRequest(of: EntryAccount.self)
                    .fetchAll(db)
            }
        } catch {
            Logger.shared.error(error.localizedDescription)
        }
        // 2. Create a mapping between server identifiers and local identifiers
        let entryIdMappings = allEntries.reduce(into: [String: Int64]()) { result, entryAccount in
            if let entryServerId = entryAccount.entry.serverId,
               let localEntryId = entryAccount.entry.id {
                result[entryServerId] = localEntryId
            }
            if let linkedEntryServerId = entryAccount.linkedEntry?.serverId,
               let localEntryId = entryAccount.linkedEntry?.id {
                result[linkedEntryServerId] = localEntryId
            }
        }
        var entriesToCreateOrUpdate: [Entry] = []
        var localEntriesForDeletion: [Int64] = []
        for change in snapshot.documentChanges {
            // 3. We can limit to the 'change types' we are interested in
            guard change.type == .added || change.type == .modified else {
                continue
            }
            let data = change.document.data()
            // 4. Shared entries that are deleted are marked as deleted (remotely only)
            if let isDeletedFlag = data["isDeleted"] as? Bool,
               isDeletedFlag {
                guard let serverId = data[Entry.Columns.serverId.name] as? String else {
                    continue
                }
                guard let localId = entryIdMappings[serverId] else {
                    Logger.shared.error("No match on our serverid/localid mappings for serverid: \(serverId) -- we've already deleted it probably ")
                    continue
                }
                localEntriesForDeletion.append(localId)
                continue
            }
            // Certain properties are required in order to accept this entry locally... reduced for brevity
            guard let serverId = data[Entry.Columns.serverId.name] as? String,
                  let userId = data[Entry.Columns.userId.name] as? String,
                  let accountServerId = data[Entry.Columns.accountId.name] as? String,
                  let accountLocalId = accountIdMappings[accountServerId],
                  ...
                  Double else {
                continue
            }
            // If localId is not nil, this means we have this entry locally. It's a potential update
            let localId: Int64? = entryIdMappings[serverId]
            let linkedEntryServerId = data[Entry.Columns.linkedEntryServerId.name] as? String
            var linkedEntryId: Int64?
            // see if we have this entry in our local db..
            if let linkedEntry = self.appDatabase.getEntry(serverId: linkedEntryServerId) {
                linkedEntryId = linkedEntry.id
            }
            let createdAtTimeStamp = data[Entry.Columns.createdAt.name] as? Double
            let updatedTimeStamp = data[Entry.Columns.updatedAt.name] as? Double
            let entry = Entry(id: localId,
                              serverId: serverId,
                              ...
                              updatedAt: updatedTimeStamp.map { Date(timeIntervalSince1970: $0) } ?? Date())
            Logger.shared.debug("An entry has been found REMOTELY that needs to be created or updated locally:: ServerId:\(serverId)")
            entriesToCreateOrUpdate.append(entry)
        } // End for changes loop
        // 5. Write to database (can ignore, only included for completeness' sake):
        Logger.shared.debug("Deleting \(localEntriesForDeletion.count) entries")
        do {
            try self.appDatabase.dbWriter.write { db in
                for localEntryId in localEntriesForDeletion {
                    try self.appDatabase.deleteEntry(entryId: localEntryId, db: db)
                    Logger.shared.debug("entry deleted from local db.")
                }
            }
        } catch {
            Logger.shared.error(error.localizedDescription)
        }
        Logger.shared.debug("Creating or Updating \(entriesToCreateOrUpdate.count) entries found on remote that we do not have saved locally or potentially need to update.")
        do {
            try self.appDatabase.dbWriter.write { db in
                for var entryToCreateOrUpdate in entriesToCreateOrUpdate {
                    guard let localId = entryToCreateOrUpdate.id else {
                        /// CREATE NEW LOCAL ENTRY FROM REMOTE ENTRY DATA
                        try self.appDatabase.saveEntry(&entryToCreateOrUpdate, db: db)
                        Logger.shared.debug("remote entry created in local db.")
                        continue
                    }
                    guard let localEntry = try? Entry.fetchOne(db, id: localId) else {
                        Logger.shared.error("A remote entry has had an unknown local id set...")
                        continue
                    }
                    guard localEntry.updatedAt < entryToCreateOrUpdate.updatedAt else {
                        Logger.shared.debug("Ignoring remote entry as local entry has been modified more recently.")
                        continue
                    }
                    /// UPDATE LOCAL ENTRY FROM NEW REMOTE ENTRY DATA!
                    try entryToCreateOrUpdate.updateChanges(db, from: localEntry)
                    Logger.shared.debug("Updated a local entry with new data from matching remote entry.")
                }
            }
        } catch {
            Logger.shared.error(error.localizedDescription)
        }
    }
}
Woo, that’s a big nasty function! I’ll walk through the numbered comments I added in an attempt to help break down what’s going on:
- Fetch all local ‘Entries’ that are under a ‘shared’ account
- Create a mapping between server identifiers and local identifiers. This mapping gives us a convenient way to look up a local Entry’s identifier by its server identifier
- Firebase tells us what kind of change occurred that triggered this listener. We can limit to the ‘change types’ we are interested in
- As I didn’t want to do any diff’ing logic, I opted to add an ‘isDeleted’ flag to any remote Entry that has been deleted. For any remote entry found to have that flag set to true, we delete our local version of it (regardless of timestamp)
- Write to the local database! This section of code can be ignored… basically all we are doing is either deleting, creating or updating local entries based on the entries found in our 2 arrays: entriesToCreateOrUpdate and localEntriesForDeletion
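Stripped of the Firestore and database plumbing, steps 2 to 4 boil down to classifying each remote document. The sketch below shows that classification on plain dictionaries. `LocalChange` and `classify` are hypothetical names, and the "serverId" and "isDeleted" keys are assumed to match the field names used in the remote documents.

```swift
// Hypothetical result of inspecting one remote document.
enum LocalChange: Equatable {
    case delete(localId: Int64)           // soft-deleted remotely, still present locally
    case updateCandidate(localId: Int64)  // known locally; still subject to the timestamp check
    case create                           // unknown locally
    case skip                             // malformed, or already deleted locally
}

/// `data` mimics the dictionary a Firestore document returns;
/// `idMappings` maps server identifiers to local row ids.
func classify(data: [String: Any], idMappings: [String: Int64]) -> LocalChange {
    guard let serverId = data["serverId"] as? String else { return .skip }
    let isDeleted = (data["isDeleted"] as? Bool) ?? false
    if let localId = idMappings[serverId] {
        return isDeleted ? .delete(localId: localId) : .updateCandidate(localId: localId)
    }
    return isDeleted ? .skip : .create
}
```

An update candidate is not applied blindly: as in the handler above, the local record still wins if its updatedAt is at least as recent as the remote one.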
For you eagle-eyed readers out there, you might have noticed that there’s an issue with using a Firestore listener combined with another process that creates and updates remote Firestore documents…
Race Condition
After adding the Firestore listener and beginning to test my beautiful synchronization system, I noticed that as soon as I marked a local record as dirty and a sync process was kicked off, I would end up with duplicates of all records marked as dirty… and they would continue to duplicate every 60 seconds 😬.
…A couple of coffees ☕️ later…
What was happening was that during the sync process, when a local record was being pushed to the remote… as soon as that remote record was created or updated our Firestore listener’s handler was triggered (DUH).
So as soon as we created a new record in Firestore, before the local version even had a chance to be written to the local database with the remote serverId and have its isDirty flag set to false… our listener’s handler was treating this newly-created Entry as one we did not have locally (due to there being no matching serverId).
My simple solution was to add another OperationQueue to the mix. (I love OperationQueues…) This OperationQueue would need to be a property on the SyncService class and it would be responsible for coordinating the flow of work received from our sync timer and our Firestore listener, i.e.:
private func accountTransactionsListenerHandler(snapshot: QuerySnapshot?, error: Error?) {
    queue.addOperation { [weak self] in
        ...
    }
}
...
@objc
func sync() {
    queue.addBarrierBlock { [weak self] in
        ...
    }
}
Note that the sync() method adds a barrier block to the queue which, as its name implies, acts as a barrier: any operations subsequently added to the queue won’t be started until the barrier block has completed.
From Apple’s documentation for addBarrierBlock(_:):
“Invokes a block when the queue finishes all enqueued operations, and prevents subsequent operations from starting until the block has completed.”
For good measure I also set the queue’s maxConcurrentOperationCount to 1.
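To see the barrier behaviour in isolation, here is a small, self-contained experiment. The labels, the `EventLog` helper and `runBarrierDemo` are made up for this sketch; the queue is left concurrent on purpose so that only the barrier is responsible for the ordering.

```swift
import Foundation

// Thread-safe log used only by this demo.
final class EventLog: @unchecked Sendable {
    private let lock = NSLock()
    private var entries: [String] = []

    func record(_ label: String) {
        lock.lock(); defer { lock.unlock() }
        entries.append(label)
    }

    var snapshot: [String] {
        lock.lock(); defer { lock.unlock() }
        return entries
    }
}

/// Enqueues two "listener" blocks, a barrier "sync" block, then one more
/// "listener" block, and returns the order in which they actually ran.
func runBarrierDemo() -> [String] {
    let queue = OperationQueue()
    let log = EventLog()

    queue.addOperation { log.record("listener-1") }
    queue.addOperation { log.record("listener-2") }
    // Runs only after both blocks above finish; holds back anything added later.
    queue.addBarrierBlock { log.record("sync") }
    queue.addOperation { log.record("listener-3") }

    queue.waitUntilAllOperationsAreFinished()
    return log.snapshot
}
```

The first two labels may appear in either order, but "sync" always lands third and "listener-3" last. Setting maxConcurrentOperationCount to 1 on top of that additionally keeps the non-barrier operations from overlapping each other.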
Conclusion
After an eventful but sweet journey I ended up with a synchronization process that seems to work okay and that allowed me to keep my existing persistence layer intact!
As with any first pass… there are probably some areas of this process that can be improved on, and maybe in the future those areas will be addressed! If they are, I will be sure to update this post with a link.
Also, there are some pitfalls when it comes to using Firebase, but if you keep an eye out for them you should be alright! Overall I am still very happy with Firebase due to its overall convenience and ability to speed up a project massively.