Diff in iOS [Part 2]

on

Hi, it’s part 2 of my LCS series. Hope you have a great day 😇

In this blog, I would like to cover how we can apply LCS (aka. Diff) into iOS Development. It’s so boring if I keep writing blog with “deep” theory, and no place for enjoying.

Before diving, we should try a real problem I ensure everybody will encounter if you’re iOS S.E.

Let imagine, we have an app, same philosophy with Instagram. We have Feed screen, where we can see all post of your friends, your following.

On the top-right of navigation bar. We have pull-to-refresh button. Whenever you tap it. It will get new data from server, then filling data into UITableView.

Your requirement is figured it out the way to “reload” table view with less effort as much as possible. 😉

Solution 1: Force reloadData()

Don’t care any things, just call

In particular scenario, it might be the best solution if you have < 10 items, and the hierarchy of cell’s subviews are simple. But what if your app grows up quickly, and have around 100 items, each TableViewRow has tons of complex subviews, such as UIStackView, UICollectionView,…

By forcing reloadData(). It means UITableView will re-draw and calculate dynamic size of 100 cells again -> It hits your CPU and reduces performance as well. It’s painful.

I will more detail in Benchmark section to show you how it’s bad.

Solution 2: Do manually

Many people will come up with solution 2. Instead of calling reloadData(). They will calculate the difference (insert/delete/reload) manually and warp into tableView.performAction{}

Some things like this

Pros

  • By calling reload/insert/delete on particular cell. You reduces time to re-draw cell. It’s good point.
  • You can apply specific animation for each kind of transform easily.

Cons

  • By calculating the diff manually, it means by your hand and brain. It’s still a nightmare if you have a many new item, and the order of new item is completely difference than previous items.
  • If you have enough patient, and careful. You could do it, but who know your solution is the best optimization? Maybe there is someone out there who come up with another comparison, have less “transform” than your result? 🤔 Working manually is not optimize approach.
  • If you make a mistake, your app crash with some kind of error like this

It always happens you indexPaths doesn’t match with data source.

Solution 3: Diff Algorithm

How about LCS (Diff). Good news, we can reuse what we actually did on LCS Part 1. By using modified Diff algorithm, we can extract “transform” from memorization table. We can know which items are removed, added, or reloaded. All of messy tasks will compute automatically. You don’t need put more effort, or get nervous when app crashes anymore.

The final result.

Those codes are self-explanatory, sound interesting enough? 😍

Besides that, I will do benchmark by each solution on my old iPhone 5, give you overall perspective.

1. Diff Algorithm

1.1 Memorization table

If you’re not familiar with that memorization-table. I would like to recommend to read LCS-Diff Part 1 before going next.

Let me remind in case you’ve forgotten. LCS is one of classic computer science problems.

Given string A = ADFGT, string B = AFOXT. The LCS of A and B is AFT.

But we need more than LCS, we should need “how” A can transform into B. Fortunately, from the table above, we can “extract” transform easily too. Memorization-Table is a new King 😎

1.2 DiffTransform enum

To understand what’s is transform stand for, I will create new generic enum

We created DiffTransform enum for representing a “transform”. Please notice one thing, I prefer use Enum with Generic and associated-value than struct/class. Because if we come with Struct/Class, we also need to define another Enum to handle kind of Transform. It’s duplicated work.

With DiffTransform, we can store “kind of transform”, the index, and the real data at the same object, with less code as possible 😉

In few trivial programming language like Java or Objective-C, it’s impossible unfortunately. We must create class and enum, but from Swift 2.0, we can do it. It’s save me a lot of redundancy code. Less code, less bug. You know 🤘🏽

In case it’s first time you’ve heard it. Take a look at Generic Enum and Enum with associated-value

I also implement few helper to quick access data in future.

From previous LCS blog, we have 3 rules. I will remind in each step, also make a sightly modification.

We will traceback from right-bottom corner to left-top corner.

  • If A[i] == B[j] ->It means, A[i] is part of LCS. Then we move diagonal by i–, j–

T from string A convert to T from string B -> There is no difference, we just reload it.

The “transform” simple is

  • if A[i] != B[i], we have two sub-case

As you can see, G != X. It means, G needs to be removed from A.

F != X. It means, X need to be added.

  • i == 0 || j ==0

It’s two special case we should mention. Because we need to figure it out the transform, don’t miss this case

Summary

By following 4 cases, we can determine when transform is applied.

Here is final instruction.

1.3 Diff

This time to implement the container for storing all DiffTransforms which we calculated by hand 😉

A Diff is a simple struct. It contains array of DiffTransform. I prefer to use Generic all time as possible. I suppose it’s hard to read at the first time, but if you’re familiar enough, it’s quite simple. Generic is really flexible for container as well.

I defined Diff<T> and DiffTransform<T> is same Generic on purpose. So both of model store same kind of value.

Beside that, I also implement few helper methods, it helps us get insertion/deletion/reload quickly, and override + operation too.

1.4 Warp it up

Time to combine all separately things into one part. I’m still following same approach from Part 1, extend Array, generate MemorizationTable, recursive the table and calculate the DiffTransform.

Memorization table

Calculate DiffTransfrom, Diff

It’s sightly modification than the original LCS Algorithm. Instead of trace-back and print LCS string, we check small case and determine when “Transform” and warp into DiffTransform.

The heart of Diff is at diffFromMemorizationTable func. It’s recursion method, and base-recursion return Diff(), then adding DiffTransform from prior recursive func. The finally result will be an Diff instance, with result is all transforms.

Apply Diff

It’s place for magic happen. [A] transform to [B] by applying DiffTransform one by one. We start with Deletion Transform firstly, then Insertion Transform. We also able to apply ReloadTransform, but it this case, we’re working on String, we don’t need to reload it-self.

Test with String

Sound easy, right? 😉 The Diff is the optimize way as possible, and it does 100% automatically. Life is easy now 🤗

2. Integration with UITableView

We create DiffCalculator struct to handle all transform logic for UITableView. It stores tableview as weak variable (We don’t want to make cycle-retain right?), and data model.

The especially things is Setter of data. We calculate the DiffTransform for Insertion/Deletion/Reload, then performing the transform in <beginUpdate> and <endUpdate>

Using DiffCaculator is really simple.

3. Benchmark

For demonstration, I write a simple app with smallDataSource (10 rows) and reload with large DataSource (20 rows) (simulate it’s from API).

On my purpose, the size of cells are dynamic, and collectionView inside each cell and the quality of images are high. Because we want to see the benchmark clearly. So make it’s more complex.

The Sample app has 3 tabs, stand for each approach we’ve discussed: Force Reload, manually calculate, and Diff algorithm.

On this benchmark, I will test on my iPhone 5, gain accurate results, it will reflect efficiency for each solution.

3.1 Force reloadData()

By calling tableView.reloadData() is naive solution we could do. As you can see clearly, the first peak intends for the first time we reload and fill data into UITableView.

Noticeable, the second peak really impacts our performance. It costs 47%. Because the table view must calculate the dynamic size on runtime, get minimum size which matched cell’s constraints, it depends on length of comment string. And reload the collectionview built-in the cell.

Please bear in mind, I did it on my purpose. Because in a real world, the layout of cells are most complex. And of course, as more as complex we can see the result clearly.

3.2 Do manually

Here is sample data: The data source of Feed screen doesn’t order by created_at, it looks like random, depend on the “score” of each Item.

How could I do manually? 🤔

How could I get insertIndexPaths and deleteIndexPaths then passing into insertRows/DeleteRows 🤔

Really hard, right? 🤔

Yeah, if you’re struggling to solve it, there are no change to make sure your result is optimization. Or you might get mistake, and your app is crashed completely.

Diff algorithm is a rescue now 🤗

3.3 Diff algorithm

It’s the perfect time to reused DiffCalculator which we implemented in the beginning.

It’s completely straight-forward. Just initialize self.diffCalculator and set data with new data from API. DiffCalculator will do difficult part for us.

We only need 6 steps to transform. It’s optimization approach for now 🤗

The result doesn’t make us disappointed. From 47%, we reduce to 21%. Good point 💯

Summary

  • Get perfect transform between 2 data source, absolutely optimize than doing by hand manually
  • 100% calculate automatically.
  • Avoid crash unexpected.
  • Suitable for complex tableview/collectionview.

4. Drawback

Everything has two sides, it’s unfair if we only mention the bright side. Does Diff algorithm (based on LCS) have any disadvantage? 🤔

Let try on large set of data. The initial data has 100 item, and new data has 210 item, and do each step which same we did.

Wow, 56% CPU for Diff. What’s wrong here 🤔. As big as data source’s length could be, the time for Diff increases significantly.

The problem is time complexity of Diff is O(n * m). So we have 100*210 = 210,000 cell in memorization-table, and of course, cost more CPU to iterate through the table.

Diff algorithm which we discussed is based on LCS [Part 1].  So it’s not best algorithm we could implement. I choice this approach because it’s simple, and it’s relevant with Part 1.

5. What’s next?

In next chapter, we will discuss Paul Heckel’s Diff.

This algorithm finds all the possible update that UITableView needs in linear time O(n) . It’s most efficiently approach which widely used in applications. Moreover, IGListKit also uses it as back core as well. 🤗

6. Where to go from here?

Source Diff on iOS

Hope you enjoy my blog about LCS and Diff on iOS Development 👍

Diff algorithm and integration with UITableView which I cover, it’s not perfect optimize solution. It’s perfect place for beginner. There are various algorithms to improve the time complexity. One of them is Paul Heckel’s Diff (was published 1978).

If you would like to take a deep research, take a look at IGListKit by Instagram Engineer, and their blog.

If you need more challenge. Here is the puzzle:

 

 

3 thoughts on “Diff in iOS [Part 2]

  1. This is a great explanation of a complex, but very useful topic for iOS. I’m currently working on a grouped UITableView which is used to edit the properties of an object. There are four types for this object and different cells/sections are shown depending on the type selected by the user.

    Basically, I need to transition between different arrangements of cells within a grouped tableview. What are your thoughts on applying this LCS approach to tableview sections?

Leave a Reply

Your email address will not be published. Required fields are marked *