
Facebook Patch for Dalvik

If you can solve a problem by re-engineering your app or modifying Android, why would you modify Android?  That’s the burning question at the heart of this blog entry.

Picard to Worf: “Android 2.3 was written that way for a reason, Mr. Worf.” (photo credit: CBS/startrek.com; blog source: Paramount)

Engineering mobile applications is embedded engineering.  It’s the art of using less code to accomplish something rather than more.  And in this case, it’s also the art of avoiding a change that risks impacting every user running Android 2.3, and every carrier with Android 2.3 phones.

Recently, David Reiss, a Facebook engineer, posted information on the Facebook blog about a Dalvik “patch” they were working on.  Interesting. My Android Spidey Sense tells me that the problem is more of an application issue than an issue within the Android OS itself.  So, I decided to find out with some quick analysis.

I’ve been working on Dalvik for the past year, so you could say I’m a Dalvik “virtual machinist.”  This engineering problem is pretty juicy, and I couldn’t pass up looking at it.  Here’s the Facebook blog post, and a quick summary of the issue David is seeing.

Under the Hood: Dalvik patch for Facebook for Android

The suggested fix does not consider that if Android 2.3’s LinearAlloc buffer is bumped from 5MB to 8MB, it may force all mobile phones running Android 2.3 through a round of regression testing.  That’s a lot of work for the carriers.  Depending on the hardware configuration of the Android phones that run 2.3, the potential side effects far outweigh a more straightforward solution: Facebook simply re-engineers its application.  But that aside …

The issue reported by Facebook and others is not a “bug” in the sense that dexopt is crashing.  Nor is the failure caused by “deep” interface hierarchies, as the Google bug suggests.  In this case, dexopt is simply complaining that the application Android is trying to load is too big: the code size is too large for the class loader to handle.  This kind of limit is nothing new to Android or any other mobile platform.  It’s been there ever since mobile phones could run applications.  So, it could be said that this is an application-level problem rather than an OS problem.  It could be resolved using two approaches:

  1. Re-engineer the application, especially in cases where you are trying to port code from one form (JavaScript) to another (Java).
  2. Modify the operating system to utilize more resources, which will affect all applications.

For reasons I explain below, taking approach #2 is a slippery slope.  It’s important to remember that 5MB is a lot of memory to chew up for a device that was designed primarily to answer phone calls. And, as you’ll see, it’s not just 5MB versus 8MB anyway. The difference is potentially 10MB versus 16MB.  And that, my coder roadies, is a whole other ballgame.

Other Factors

The Facebook blog mentions that this limit was hit because code was ported from JavaScript to Java.  So, this occurred when Facebook decided to add more features to their native Android application.

The Quick Conclusion/Upshot

The basic problem being reported is that when an Android phone loads your program to run, there are limits placed on the amount of code your application can have.  That’s a good thing.  If you exceed those limits, it’s much wiser to re-think your application design than to tweak the OS.

First of all, let me disabuse you of the notion that if dexopt fails in this way, it must be a “bug” in the Android platform.  Android was pretty thoroughly tested, as far as mobile operating systems go, so it’s important to consider that your application design might simply need to be re-worked.  In this case, while it appears that a Google engineer has signed on to address this “issue,” that doesn’t mean Google thinks it’s a bug either.

The limit relevant here is a 5MB limit placed on all Android applications for Android 2.3.  These kinds of limits are there for a reason.  This one is especially important to respect in the case of Android 2.3, because devices running that version of the OS tend to be more resource-constrained than Android 4.0 devices.

Bumping up the amount of memory that LinearAlloc uses will increase the amount of memory that ALL Android applications consume at the outset, when loading their classes. Each application that starts will have this amount of memory allocated for the class loader.

So, if you have 20 apps on your phone, each of those apps is going to allocate that amount of memory (5MB on Android 2.3, and 8MB on Android 4.0).  This is very important to consider because Android allows any of its applications to start background services.  For planning purposes, you must account for the fact that every one of the apps on your phone may be running simultaneously, because any of them may be running a service.
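
To put rough numbers on that worst case, here is a quick back-of-the-envelope calculation.  The per-app buffer sizes are the LinearAlloc limits discussed in this post; the count of 20 apps is only an assumption for illustration.

/*
 * Back-of-the-envelope arithmetic only -- not Dalvik code.
 * The app count of 20 is an assumption for illustration.
 */
#include <stdio.h>

int main(void)
{
    const int apps = 20;             /* assumed number of installed apps */
    const int gingerbread_mb = 5;    /* LinearAlloc buffer on Android 2.3 */
    const int ics_mb = 8;            /* LinearAlloc buffer on Android 4.0 */

    printf("Android 2.3 worst case: %d MB\n", apps * gingerbread_mb);  /* 100 MB */
    printf("Android 4.0 worst case: %d MB\n", apps * ics_mb);          /* 160 MB */
    return 0;
}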

More Details and Analysis

Here are the basic workings of how LinearAllocHdr (LinearAlloc) is managed by Android:

LinearAllocHdr is a data structure in Android that so far appears to be used only by the class loader.  But, nothing says it can’t be used for other things in the future.  Here’s the structure:

/*
 * Linear allocation state.  We could tuck this into the start of the
 * allocated region, but that would prevent us from sharing the rest of
 * that first page.
 */
typedef struct LinearAllocHdr {
    int     curOffset;          /* offset where next data goes */
    pthread_mutex_t lock;       /* controls updates to this struct */

    char*   mapAddr;            /* start of mmap()ed region */
    int     mapLength;          /* length of region */
    int     firstOffset;        /* for chasing through */

    short*  writeRefCount;      /* for ENFORCE_READ_ONLY */
} LinearAllocHdr;

mapAddr points to the block of memory that is allocated.

This structure is instantiated by the function dvmLinearAllocCreate.   The 5MB that David’s post talks about is actually a 5MB file that gets memory mapped. The length of the file is defined by DEFAULT_MAX_LENGTH:

/* default length of memory segment (worst case is probably "dexopt") */
#define DEFAULT_MAX_LENGTH  (5*1024*1024)
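
For orientation, here is a simplified sketch of what dvmLinearAllocCreate does with that constant.  The real code in dalvik/vm/LinearAlloc.c uses an ashmem region when available, reserves the first page for bookkeeping, and handles ENFORCE_READ_ONLY, so treat this as an approximation rather than the actual implementation:

/*
 * Simplified sketch of dvmLinearAllocCreate -- approximates the code in
 * dalvik/vm/LinearAlloc.c.  The ashmem path, first-page reservation, and
 * read-only enforcement are omitted here.
 */
LinearAllocHdr* dvmLinearAllocCreate(Object* classLoader)
{
    LinearAllocHdr* pHdr = (LinearAllocHdr*) malloc(sizeof(*pHdr));
    if (pHdr == NULL)
        return NULL;

    pHdr->mapLength = DEFAULT_MAX_LENGTH;   /* 5MB on Android 2.3 */

    /* Reserve the whole region up front as a private mapping.  Physical
     * pages only become resident as the class loader writes into them. */
    pHdr->mapAddr = (char*) mmap(NULL, pHdr->mapLength,
                                 PROT_READ | PROT_WRITE,
                                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (pHdr->mapAddr == MAP_FAILED) {
        free(pHdr);
        return NULL;
    }

    pHdr->curOffset = pHdr->firstOffset = 0;   /* real code starts past page 0 */
    pthread_mutex_init(&pHdr->lock, NULL);
    return pHdr;
}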

Memory mapping files is commonly done in Android.  In fact, Dalvik maps your entire application’s code into memory this way.  This means two things for the problem at hand:

  1. 5MB of disk space is used to store the underlying data, and
  2. 5MB of memory is taken up to store the file’s contents in active memory.

So, the impact on the operating system is actually 10MB (worst case). We’re not just talking about 5MB here; we’re talking about potentially using twice that.  If you increase the limit to 8MB, you’re potentially hitting the OS with a 16MB allocation.  Now we’re getting into some serious memory for a mobile device to manage.  Remember, it’s an embedded system on a slow processor, especially in the case of Android 2.3 and OMAP.
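
To make the mechanism concrete, here is a minimal, generic example of memory-mapping a file on Linux/Android.  The file name is just a placeholder, not a real Dalvik path; the point is that the mapping reserves address space immediately, while physical pages are pulled in lazily as the region is touched:

/*
 * Minimal, generic file-mapping example (the same mmap() mechanism the
 * text describes).  "example.dex" is a placeholder name, not a real path.
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("example.dex", O_RDONLY);        /* placeholder file */
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) != 0) { perror("fstat"); close(fd); return 1; }

    /* Address space is reserved now; physical pages are read in on demand
     * when the mapped region is actually touched (page faults). */
    void* addr = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                                     /* mapping survives close */
    if (addr == MAP_FAILED) { perror("mmap"); return 1; }

    printf("mapped %lld bytes at %p\n", (long long) st.st_size, addr);
    munmap(addr, st.st_size);
    return 0;
}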

dvmLinearAllocCreate is called by dvmClassStartup (Class.c), which confirms that the only place this is used (for now) is the class loader.  But this is a very critical piece of memory. The more memory the class loader uses, the more overhead you create for Linux, and the slower your applications might start up.  Again, perhaps not noticeable on Android 4.0, but it might be noticed on an Android 2.3 device, especially a cheap one with lower-end hardware, and especially when that cost is applied to every application that runs on the device.
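
The failure David’s post describes actually comes from the allocation path rather than from creation.  The sketch below is my simplified reconstruction of the capacity check inside dvmLinearAlloc; getHeader stands in for the real source’s lookup of the per-ClassLoader header, and the real code adds block headers, alignment, and page-protection handling and aborts the VM with a “LinearAlloc exceeded capacity” error instead of returning NULL:

/*
 * Simplified reconstruction of the capacity check in dvmLinearAlloc.
 * Block headers, alignment, and ENFORCE_READ_ONLY handling are omitted;
 * the real VM aborts when the buffer is exhausted.
 */
void* dvmLinearAlloc(Object* classLoader, size_t size)
{
    LinearAllocHdr* pHdr = getHeader(classLoader);   /* per-loader state */

    dvmLockMutex(&pHdr->lock);
    int startOffset = pHdr->curOffset;
    int nextOffset  = startOffset + (int) size;

    if (nextOffset > pHdr->mapLength) {
        /* This is the condition behind the "LinearAlloc exceeded capacity"
         * failure: the generated class data no longer fits in the buffer. */
        dvmUnlockMutex(&pHdr->lock);
        return NULL;                                 /* real VM aborts here */
    }

    pHdr->curOffset = nextOffset;
    dvmUnlockMutex(&pHdr->lock);
    return pHdr->mapAddr + startOffset;
}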

7 thoughts on “Facebook Patch for Dalvik”

  1. My app was great until I needed to add Google Play Services because of a third-party library required by business rules. Not my coding.

  2. I know nothing about Dalvik; however, I agree with Richard that it would be better to redesign your application to use less memory than to patch Dalvik to solve your problem, if possible. mmap() maps files into the virtual memory of your application’s process; it costs nothing until your process references a virtual page. If the virtual page is not in RAM, a page fault is raised and the kernel reads the page into RAM (page-in). Regarding this particular Facebook 5MB issue, their code will be loaded into RAM and will then use more than 5MB at the expense of other applications. I don’t know how much RAM Facebook needs to run its application; however, if RAM is not big enough (2.3 shipped on 256MB RAM devices) to cope with bigger application code, applications will not crash because of a higher page-fault rate (mmap() does not use swap space), but they will perform more slowly due to the paging rate (page-in, pages stolen). The kernel’s read-ahead for reading pages (files) into RAM has to be reconsidered. Not to mention that a few applications could crash for other reasons if they were not tested in a more RAM-constrained environment. In short, tuning and regression tests should be done for all the applications that could run concurrently.

    All in all, there is no point in favoring Facebook over other applications. If they want to port their application to 2.3, they should scale it down, remove a few features, and optimize their code instead of patching the underlying software unless it is truly necessary. I believe it is always possible to optimize applications to use fewer resources; however, most developers don’t want to be bothered by resource limitations.

    • “most developers don’t want to be bothered by resource limitations”

      Bingo!

      It seems that “hobbyist” mobile developers try to ignore resource limitations. The professionals can work them to their advantage.

      Oftentimes, Google Play Store user comments speak for themselves.

  3. The 5MB mmap doesn’t cause any disk space to be allocated; it’s an anonymous in-memory mapping only. The initial mmap() only causes a virtual page map entry to be created. Physical pages aren’t allocated until the memory is touched, and Android devices are configured without swap. So the initial allocation is nearly free.

    Allocations made during the zygote class pre-loading can actually be shared between multiple processes — part of the motivation for LinearAlloc was to collect all of the VM-generated class data (like vtables) that are generated once and never touched, so that they could be shared easily between app processes. (It can also be flagged read-only to help debug memory trashers.)

    So, 5MB isn’t quite as bad as it first appears.

    The trick is to find a limit that allows all reasonable apps but prevents truly unreasonable behavior, i.e. something that could allow an app to chew up half of physical memory. 5MB turned out to be too small, which is why the limit was bumped up to 8MB in a later release, and currently sits at 16MB.

And now it's your turn ... comment here or on Twitter (@Androider)
