Fraud-Proofing an Android App: Choosing the Best Device ID for Promo Abuse Prevention
⚡Key takeaways:
Promo abuse is a type of fraud where bad actors take advantage of a business’s sign-up bonuses, referrals, coupons, or promotions.
MediaDRM should be preferred over device fingerprinting whenever possible.
Ultimately, our research suggests the best device ID suitable for blocklisting is a combination of MediaDRM+device model.
Keep in mind that your specific scenario may require a different approach. Drop a message at info@talsec.app, and Talsec security experts will help you.
Recently, we’ve faced a challenge in mobile device identification — how to identify and blocklist a fraudulent device without compromising user privacy. This issue holds particular significance for mobile application owners who frequently contend with users evading payments or exploiting various bonuses.
While it’s common to attract users with enticing bonuses during their initial sign-up, this strategy comes with inherent risks and is prone to abuse. Malicious users go as far as reinstalling the app multiple times to collect sign-up bonuses repeatedly, a behavior we term “multi-instancing.”
Since Android alters some device IDs for each new app instance, pinpointing the same bad actor’s device becomes challenging for blocklisting. This underscores the complexity of our pursuit for an effective solution.
In the past, you could identify a device by its MAC address or IMEI without requesting special permissions. Today, after many Android privacy-oriented changes, several semi-persistent IDs are available on Android devices, such as AndroidID, MediaDRM, GSF ID, FID, and InstanceID. Of course, asking users for any permissions with elevated access and potential security implications is unrealistic.
Each of the IDs has its ups and downs, making them useful in different scenarios; see the example table below. You can find additional information about the IDs in the Android documentation.
Alternatively, the ID can be constructed by various device fingerprint libraries (e.g., fingerprintjs-android) based on multiple device IDs, device states, OS fingerprints, or installed apps.
Let’s look closer at these IDs.
While these identification methods may serve well in other contexts, in our scenario they lack the resilience needed to withstand fraudulent multi-instancing. These IDs are relatively easy to change, so here is only a quick summary of the downsides of each:
AndroidID changes if the app is repackaged or installed under another user on the device
GSF ID (Google Services Framework ID) is restricted to devices with Google services and can be relatively easily spoofed by XPrivacyLua. It also changes for different users.
FID (Firebase installation ID) won’t survive reinstallation
InstanceID (a GUID, e.g. `UUID.randomUUID().toString()`) is a custom-generated and internally stored ID, but it won’t survive reinstallation
Google Advertising ID is not suitable at all, because users can reset it at any time
In summary, none of those IDs is solid enough to identify bad actors in our scenario.
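To illustrate why a self-generated InstanceID does not survive reinstallation, here is a minimal sketch in plain Java. The class name `InstanceIdStore`, the file name, and the `simulateUninstall` helper are hypothetical; the directory passed in stands in for the app's private storage, which is wiped when the app is uninstalled.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

/** Hypothetical helper: a self-generated InstanceID kept in app-private storage. */
final class InstanceIdStore {
    private final Path idFile;

    /** appDataDir stands in for the app's private data directory (an assumption). */
    InstanceIdStore(Path appDataDir) {
        this.idFile = appDataDir.resolve("instance_id.txt");
    }

    /** Returns the stored ID, generating and persisting a new one on first use. */
    String getOrCreate() {
        try {
            if (Files.exists(idFile)) {
                return new String(Files.readAllBytes(idFile), StandardCharsets.UTF_8).trim();
            }
            String id = UUID.randomUUID().toString();
            Files.write(idFile, id.getBytes(StandardCharsets.UTF_8));
            return id;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Simulates an uninstall, which removes everything in private storage. */
    void simulateUninstall() {
        try {
            Files.deleteIfExists(idFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Within one installation `getOrCreate()` keeps returning the same value; after the stored file is gone, the next call produces a fresh random ID, so a blocklist keyed on it is trivially bypassed.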
Notice: In the whole analysis, we worked with the fingerprintjs-android library that uses a comprehensive list of signals for stateless offline fingerprinting. The results of other fingerprinting libraries may be different. They may include more signals and heuristics based on their data insights, geolocation, IP location, TEE, and possibly other magic. Talsec collects the fingerprintjs-android V3 fingerprint with StabilityLevel.OPTIMAL. Differences between stability levels (STABLE, OPTIMAL, UNIQUE) can be found here.
Glance over the table above once again. At first sight, the Hardware Fingerprint could be the best option: it survives anything except for the Instant App event. However, this ID has one serious drawback: collisions. They stem from the way the ID is calculated. As the name suggests, the ID is based solely on the device’s hardware. Imagine all the Samsung Galaxy Z Flips coming straight off the assembly line: with a hardware-based fingerprint, all of them will have the same ID. This type of fingerprint is called a STABLE fingerprint.
On the other end, there is the UNIQUE fingerprint, which uses a considerable number of signals to calculate the device’s ID. IDs created this way have a high probability of matching only one user (a minimal number of collisions), but because so many signals are involved, the ID changes quite often (e.g., after a change in settings). One user ends up with many IDs, which makes this fingerprint useless in our use case. Example: the ID may change if the set of installed apps changes or Data Roaming is enabled/disabled.
A third type of fingerprint, OPTIMAL, is a compromise between the STABLE and UNIQUE fingerprints. It is less stable but more unique than the STABLE one, and it is the one collected by the Talsec SDK. But even this type of fingerprint is not as optimal as it sounds, as shown later in this article. Example: the ID may change if the user switches between the 12-hour and 24-hour clock format, or if Development Settings or ADB is enabled/disabled.
Fingerprint example: `f37fc958dc6d566a8f4bf1e0fd25b510`
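The trade-off between stability levels can be illustrated with a toy model. This is not how fingerprintjs-android computes its fingerprints: the signal names, the two signal groups, and the use of an MD5-based name UUID as the hash are all assumptions chosen only to keep the sketch short.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Toy fingerprint: hash only the signals selected for a given stability level. */
final class ToyFingerprint {
    // Hypothetical signal groups; the real library uses a different, longer list.
    static final List<String> STABLE = List.of("manufacturer", "model", "cpu");
    static final List<String> OPTIMAL =
            List.of("manufacturer", "model", "cpu", "timeFormat", "adbEnabled");

    /** Joins the selected signals in a fixed order and returns a 32-char hex digest. */
    static String compute(Map<String, String> signals, List<String> selected) {
        StringBuilder joined = new StringBuilder();
        for (String name : selected) {
            joined.append(name).append('=')
                  .append(signals.getOrDefault(name, "")).append(';');
        }
        byte[] bytes = joined.toString().getBytes(StandardCharsets.UTF_8);
        // MD5-based name UUID, used here only as a convenient deterministic hash.
        return UUID.nameUUIDFromBytes(bytes).toString().replace("-", "");
    }
}
```

With these groups, two units of the same model produce identical `STABLE` values (a collision), while flipping `timeFormat` on a single unit changes its `OPTIMAL` value (instability).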
MediaDrm is an Android API enabling the secure provision of encryption keys to MediaCodec for premium content playback. It uses DRM providers like Google’s Widevine and Microsoft’s PlayReady. During initial DRM use, device provisioning obtains a unique certificate stored in the device’s DRM service.
Not only is the MediaDRM ID provided by this API the same for all users on the device, but in our scenario it is also harder to spoof and survives many attacks. No permissions are required to get this ID.
Yet, it still has limitations. It may be missing on devices that don’t support MediaDrm. Also, MediaDRM seems to have quite a few collisions among devices of the same manufacturer, as we show later.
MediaDRM example: `e3af1aa4dacb6b6637846488b511e7643c6ac20b65c95baad164b122ecb036b6`
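A rough sketch of how a hex string like the one above could be derived. On a device, the raw bytes would come from the Android `MediaDrm` API (shown only in a comment, since it cannot run off-device). Hashing the bytes with SHA-256 before hex-encoding is an assumption of this sketch, and the class name `MediaDrmIdEncoder` is hypothetical.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

/** Hypothetical helper turning raw MediaDrm bytes into a hex device ID. */
final class MediaDrmIdEncoder {
    /** Widevine DRM scheme UUID, passed to the MediaDrm constructor on Android. */
    static final UUID WIDEVINE_UUID =
            UUID.fromString("edef8ba9-79d6-4ace-a3c8-27dcd51d21ed");

    /**
     * Hashes the raw device-unique bytes and hex-encodes the digest.
     * On Android the input would be obtained roughly like this (not executed here):
     *   new MediaDrm(WIDEVINE_UUID)
     *       .getPropertyByteArray(MediaDrm.PROPERTY_DEVICE_UNIQUE_ID)
     * The constructor throws if the scheme is unsupported, which corresponds to
     * the "ID may be missing" limitation mentioned in the text.
     */
    static String toHexId(byte[] rawDeviceUniqueId) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(rawDeviceUniqueId);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is mandatory on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }
}
```

Hashing keeps the raw DRM identifier out of logs and backends while preserving equality, which is all a blocklist needs.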
We tested five devices and emulators under multiple multi-instancing scenarios and checked whether the IDs changed or remained the same. The most interesting ones are MediaDRM and Fingerprint (V3 & V5 Optimal), so we especially paid attention to these.
Multi-instancing scenarios:
First install of the app
Reinstall of app
Install in Work Profile
Make a clone of the app using Island App
Make a clone of the app using Parallel Space
Make a clone of the app using Parallel Apps
Make a clone of the app using Second Space (Xiaomi)
Installation in Guest Profile
Factory reset
Android emulator vs. actual device
This tedious work was made easier thanks to a great Fingerprint OSS Demo tool.
Here are the most important observations. Not every test could be performed on every device, so we ran the low-hanging-fruit ones, essentially those an attacker would also attempt.
Remained the same (= good):
MediaDRM remained the same for the First install and Island App on the OnePlus 8 Pro
MediaDRM remained the same for the First install and Second Space on Redmi Note 10 Pro
MediaDRM remained the same after the Factory reset on the OnePlus 8 Pro
MediaDRM remained the same for First install, Work Profile, and Multiple Users on the OnePlus 8T
Fingerprint V5 Optimal remained the same for the First install and Parallel Space on the OnePlus 8T
MediaDRM and Fingerprint V3 & V5 Optimal remained the same after Reinstall on Redmi Note 10 Pro
Fingerprint V5 Optimal remained the same after reinstalling for First install, Work Profile, and Parallel Space on OnePlus 8T
Changed (= bad):
Fingerprint V5 Optimal changed for First install and Island App
Fingerprint V5 Optimal changed after a Factory reset on the OnePlus 8 Pro
Fingerprint V5 Optimal changed in Second Space on Redmi Note 10 Pro
MediaDRM changed in Parallel Space on the OnePlus 8T
Fingerprint V5 Optimal changed in Work Profile and Guest User on OnePlus 8T
Other:
Fingerprint V3 Optimal was better than Fingerprint V5 Optimal for the emulator
MediaDRM was different on Emulator 1 and Emulator 2 (both on the same Windows machine)
Based on the observations, Fingerprint V3 and V5 Optimal failed in many multi-instance fraud scenarios compared to MediaDRM. From these tests, we can conclude that MediaDRM is the better one.
To quantify the results, we took our data, evaluated the behavior of those IDs, and came up with the most suitable ID based on the data. Remember that our data may be skewed and not representative compared to the devices of your user base.
We took two weeks of our freeRASP data and analyzed it. As the period is relatively short, we assume it contains only a limited number of reinstallations. Again, beware that we chose this window deliberately but without any research into the real reinstall rate, which may differ based on the category/use case of the specific application.
Below, you can see the number of unique values for each ID and the number of unique device models captured in this data.
AndroidID: 13 402 601
FingerprintV3: 22 525 265
MediaDRM: 13 285 081
InstanceId: 13 740 706
Distinct device models: 14 175 (e.g., Pixel 4, SM-G973N, ONEPLUS A5000, LG-H930, …)
The first thing that stands out is the number of FingerprintV3 values, which is much higher than for the other IDs. A likely cause is the behavior of FingerprintV3, which changes whenever a user changes any of the 32 observed fingerprinting signals.
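Counts like the ones above boil down to a distinct-value count per ID column. A minimal sketch, assuming each telemetry record is available as a map from ID name to value (the record shape and the class name `DistinctIdCounter` are hypothetical, not the actual freeRASP data format):

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: counts distinct values per ID column. */
final class DistinctIdCounter {
    /** Each row maps an ID name (e.g. "MediaDRM") to the value seen in one record. */
    static Map<String, Integer> countDistinct(List<Map<String, String>> rows) {
        Map<String, Set<String>> seen = new LinkedHashMap<>();
        for (Map<String, String> row : rows) {
            for (Map.Entry<String, String> entry : row.entrySet()) {
                if (entry.getValue() != null) { // skip IDs missing on this device
                    seen.computeIfAbsent(entry.getKey(), k -> new HashSet<>())
                        .add(entry.getValue());
                }
            }
        }
        Map<String, Integer> counts = new LinkedHashMap<>();
        seen.forEach((column, values) -> counts.put(column, values.size()));
        return counts;
    }
}
```

An ID that changes between records from the same device, as FingerprintV3 does, inflates its distinct count relative to the more persistent IDs.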
After that, we looked at the co-occurrence of the IDs to see the relation between them.
How to read the table below: One AndroidID has 1.00557 unique MediaDRMs on average, and 0.54% of unique AndroidIDs have more than one MediaDRM.
Based on the data, we can’t say what a “different device” is (as we are still looking for the “best” identifier).
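The two figures used to describe each pair of IDs (the mean number of distinct partner IDs, and the share of IDs with more than one partner) can be computed as in the following sketch. The class name `CoOccurrence` and the pair-based input format are assumptions for illustration, not the pipeline actually used for the analysis.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: co-occurrence statistics between two ID types. */
final class CoOccurrence {
    final double meanDistinctPerKey;   // e.g. mean number of MediaDRMs per AndroidID
    final double shareWithMoreThanOne; // fraction of keys with more than one partner

    private CoOccurrence(double mean, double share) {
        this.meanDistinctPerKey = mean;
        this.shareWithMoreThanOne = share;
    }

    /** Each pair is {keyId, otherId}, e.g. {androidId, mediaDrm} from one record. */
    static CoOccurrence of(List<String[]> pairs) {
        Map<String, Set<String>> partnersPerKey = new HashMap<>();
        for (String[] pair : pairs) {
            partnersPerKey.computeIfAbsent(pair[0], k -> new HashSet<>()).add(pair[1]);
        }
        if (partnersPerKey.isEmpty()) {
            return new CoOccurrence(0.0, 0.0);
        }
        long totalPartners = 0;
        long keysWithSeveral = 0;
        for (Set<String> partners : partnersPerKey.values()) {
            totalPartners += partners.size();
            if (partners.size() > 1) {
                keysWithSeveral++;
            }
        }
        int keys = partnersPerKey.size();
        return new CoOccurrence((double) totalPartners / keys,
                                (double) keysWithSeveral / keys);
    }
}
```

The same computation applies to ID-to-model pairs: feeding it {mediaDrm, model} records gives the mean number of models per MediaDRM, which is the collision measure discussed later.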
Let’s look at how many models per ID there are on average. Remember that we want as few collisions (distinct devices with the same ID) as possible, so we expect each ID to have only one associated model. A quick glance over the table below shows that there truly are some discrepancies.
Based on the data analysis, we can state the following:
FingerprintV3 has too many values compared to other IDs, making it less useful in our scenario.
One AndroidID/MediaDRM/InstanceID usually has multiple FingerprintV3s.
AndroidID and MediaDRM are roughly 1:1; some MediaDRMs have multiple AndroidIDs (this happens more often than an AndroidID having multiple MediaDRMs).
In some cases, one AndroidID has multiple InstanceIDs (more often than MediaDRMs), similar to the relationship between MediaDRM and InstanceID.
InstanceID is more bound to AndroidID than MediaDRM.
AndroidID has only one model (with a tiny number of outliers).
MediaDRM usually has one model, but there are a few collisions (more than in the case of AndroidID).
InstanceID falls somewhere in between the two.
Overall, the best of these identifiers seems to be AndroidID, followed by MediaDRM. InstanceID can also be helpful, but less so than AndroidID. FingerprintV3 is useless in our scenario. Since AndroidID changes after reinstallation and is relatively easy to spoof, MediaDRM seems to be the most suitable for fraud detection.
However, MediaDRM seems to have quite a few collisions (based on our analysis of the models). We have discovered that the collisions occur most often with devices of the same manufacturer (i.e., devices of the same manufacturer are much more likely to have the same MediaDRM than devices of different manufacturers). Here are some numbers for you to get an overview:
0.005% of MediaDRMs have more than one manufacturer
0.55% of MediaDRMs have more than one model of the same manufacturer
The mean number of manufacturers per MediaDRM: 1.000085
The mean number of models of one manufacturer per MediaDRM: 1.006362
After many attempts (which we won’t elaborate on here), we experimented with a combination of MediaDRM+model as a potential ID.
Example of a combined MediaDRM+model for a Google Pixel 4: `e3af1aa4dacb6b6637846488b511e7643c6ac20b65c95baad164b122ecb036b6+Pixel 4`
Below is the same table of co-occurrences as above, now containing relations with MediaDRM+model:
We can see that MediaDRM+model behaves better than the original MediaDRM: each MediaDRM+model has a lower number of other IDs associated with it, meaning that we have avoided a few collisions. We cannot quantify that amount exactly, but the difference between the MediaDRM and MediaDRM+model numbers gives an estimated lower bound.
When an ID is built as a combination of two characteristics, one device could end up with more than one ID. However, this should not be the case with MediaDRM+model, as one device should have only one model associated with it (i.e., one physical unit of a Google Pixel 4 phone should always have one and only one model name, “Pixel 4”).
Therefore, based on the data, we suggest using a simple combination of MediaDRM and device model as an ID for the examined case of fraud detection.
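A minimal sketch of blocklisting on the combined ID, using the `+` separator from the example above. The class name `DeviceBlocklist` and the in-memory set are hypothetical; a real deployment would likely keep the blocklist server-side. On Android, the model string would typically be read from `android.os.Build.MODEL`, which is an assumption of this sketch rather than something the analysis prescribes.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory blocklist keyed by MediaDRM+model. */
final class DeviceBlocklist {
    private final Set<String> blocked = new HashSet<>();

    /** Joins the two parts with '+', mirroring the example format in the text. */
    static String combinedId(String mediaDrmHex, String model) {
        return mediaDrmHex + "+" + model;
    }

    /** Marks a device as fraudulent. */
    void block(String mediaDrmHex, String model) {
        blocked.add(combinedId(mediaDrmHex, model));
    }

    /** Returns true if this MediaDRM+model pair was blocked earlier. */
    boolean isBlocked(String mediaDrmHex, String model) {
        return blocked.contains(combinedId(mediaDrmHex, model));
    }
}
```

Because the model is part of the key, a different model that happens to share the same MediaDRM value is not caught by a block placed on the first one, which is the collision reduction the combination is meant to provide.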
We’re tackling a challenge in mobile device identification: how to block fraudster devices without compromising user privacy. Mobile app owners face issues with users evading payments and exploiting bonuses through multi-instancing, i.e., reinstalling the app multiple times. Unreliable device IDs make it tough to identify persistent bad actors. Our findings recommend prioritizing MediaDRM over device fingerprinting (or, even better, a combination of MediaDRM and device model) for effective blocklisting. Don’t forget to enhance security with layers like AppiCrypt, RASP, and KYC solutions. Every scenario is unique, so for tailored guidance, contact Talsec security experts at info@talsec.app.
Written by Dáša Pawlasová, Matúš Šikyňa, and Tomáš Soukal