protect from decompiler

How can I protect my SWF from decompilers?
ericlin@ms1.hinet.net

The answer is "no way". At least, there is no way to stop me: with the proper tools, I can decompile any SWF.

So, do not put important information in the SWF. Do not include your personal account or password in the SWF.

I will briefly discuss the history of "protection" techniques and how they failed.

Then I will discuss the best we can do. As an old Chinese saying goes: "It protects against the gentleman, but not against the professional thief."

Open-file-format

Before any discussion, we must understand that SWF is an open format. An open format means that SWF files are not produced exclusively by Flash; other companies can create SWF files that play in the Flash player. It also means that what information is stored at which position is public knowledge. The meaning of each byte is public. So, if I have time to check the SWF byte by byte, I know everything.

Of course, I won't have the time to check a 2 MB SWF byte by byte. So there is software to help me do the job. If that software runs into trouble, I take over temporarily, check the bytes where the trouble occurs, fix them, and then continue. Nothing can hide. The only limits are my time and my patience.

If the reward for decompiling an SWF were millions of dollars, I would surely be willing to spend years reading it byte by byte.

OK, here is the history of the war between decompilers and protection.

protect from import

Since the birth of Flash, Macromedia has given authors a "protect from import" option with a password. If you protect an SWF this way, it cannot be imported into Flash. Without the protection, some of the vector graphics in the SWF can be imported into a .fla file.

This protection provides nothing but a false sense of security.

Think about it: your SWF is going to be played by the user's player, and you cannot protect it from that player. So how does this option protect the SWF? Well, the protection is in the Flash software you buy: Flash refuses to import the file if there is a password string in the SWF. Nonsense, right? I can open that SWF in a hex editor, delete the password string, and the protection is gone.

How easy it is! So, forget about that protect option.
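What the hex editor does by hand can be sketched in a few lines of Python. This is only a sketch under assumptions: it expects an uncompressed file (signature "FWS"), and it assumes that the import protection is stored as the Protect tag, tag code 24 in the published SWF format. The function names are made up for this illustration.

```python
import struct

PROTECT_TAG = 24  # assumed: tag code of the Protect tag in the published format


def first_tag_offset(swf: bytes) -> int:
    """Skip the 8-byte header, the frame-size RECT, frame rate and frame count."""
    nbits = swf[8] >> 3                    # top 5 bits: bits per RECT field
    rect_bytes = (5 + 4 * nbits + 7) // 8  # RECT is padded to a whole byte
    return 8 + rect_bytes + 4


def iter_tags(swf: bytes):
    """Yield (tag_code, start, end) for every tag up to and including End."""
    pos = first_tag_offset(swf)
    while pos + 2 <= len(swf):
        start = pos
        (code_and_len,) = struct.unpack_from("<H", swf, pos)
        pos += 2
        code, length = code_and_len >> 6, code_and_len & 0x3F
        if length == 0x3F:                 # long form: real length follows
            (length,) = struct.unpack_from("<I", swf, pos)
            pos += 4
        pos += length
        yield code, start, pos
        if code == 0:                      # End tag
            break


def strip_protect(swf: bytes) -> bytes:
    """Return a copy of the file with every Protect tag removed."""
    if swf[:3] != b"FWS":
        raise ValueError("expected an uncompressed SWF (FWS signature)")
    out = bytearray(swf[:first_tag_offset(swf)])
    for code, start, end in iter_tags(swf):
        if code != PROTECT_TAG:
            out += swf[start:end]
    struct.pack_into("<I", out, 4, len(out))  # fix the file length in the header
    return bytes(out)
```

Nothing in the sketch needs the password: the tag is simply skipped while copying.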

Convert to projector file and compression

If I convert it to a projector file in EXE format, can the EXE file be decompiled?

Yes. The SWF is still in there. There are programs that can easily extract the SWF.

Compression may make the SWF unreadable in a hex editor. Is this a protection?

No. The compression algorithm is zlib, the same kind of compression used by zip. It is easily decompressed.
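To show how little the compression hides, here is a minimal Python sketch. It assumes the usual layout of a compressed file: the signature "CWS", one version byte, a 4-byte uncompressed length, and then a single zlib stream. The function name inflate_swf is made up for this example.

```python
import zlib


def inflate_swf(data: bytes) -> bytes:
    """Turn a zlib-compressed SWF (CWS) into a plain one (FWS)."""
    signature = data[:3]
    if signature == b"FWS":
        return data                      # already uncompressed
    if signature != b"CWS":
        raise ValueError("not a zlib-compressed SWF")
    body = zlib.decompress(data[8:])     # everything after the 8-byte header
    return b"FWS" + data[3:8] + body     # keep version and length, swap signature
```

The result can be opened in a hex editor like any uncompressed SWF.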

Flasm and the p-code

Then, in the era of Flash 5, two popular tools appeared: the free "flasm" and the commercial "ASV 2.0".

Flasm is "Flash asm". It translates the bytecode in the SWF into short, readable codes (p-codes).

For example, "a=3" is displayed as "push 'a', 3", "setVariable";

The corresponding bytecode in the SWF is "96 08 00 00 61 00 07 03 00 00 00 1D"

This is an invaluable tool if we want to study the structure of the SWF format.

Programmers like to develop software in high-level languages such as C or C++. But when something badly needs efficiency, they incorporate low-level assembly code. In the same way, authors sometimes use flasm to write low-level p-code for efficiency.

So flasm has the power to edit the ActionScript in the SWF. You can find examples of this optimization technique being used to speed up 3D code.

However, malicious users can also "edit" the SWF. Any lock in the SWF can be easily removed. We don't need a "key" to open the lock. We just remove the lock.

Here is a common, well-known technique to keep our movie from being stolen and shown on another domain. We script a check for _url. If the _url is not on our domain, we disable the functions and display the message "You are a thief". However, it is easy to remove this script with flasm. Cracking this protection takes no more than a minute.

Actionscript Viewer and void (a)<=b>"c" || 0(!1 && !0)

ASV can extract the symbols, so the sounds, shapes and bitmaps can be stolen.

It also extracts the ActionScript bytecode. ASV 2 tries to match the p-codes against high-level ActionScript. When it meets "push 'a', 3", "setVariable", it displays "a=3", in the same language as ActionScript. However, we can easily crash it by creating code that does not match any pattern.

The code created by flasm easily strays from the standard patterns, so ASV won't get a match.

The famous script that crashes ASV 2 is the one in the heading above: void (a)<=b>"c" || 0(!1 && !0); This is junk code. It does nothing but confuse ASV 2.

However, once the protection scripts became well known, the author of ASV (Burakk) of course did not let it go. The protection technique did not last long: ASV 3 put an end to it.

Booming of decompilers

Then came the era of MX. The penetration rate of Flash boomed, and many decompilers appeared.

ASV 4 is the present version. It displays more than the matched ActionScript. It displays p-codes if there is no match. If it has trouble interpreting the p-codes, it displays the bytecode in the SWF, together with the offset in the SWF file. This means that it never "fails". It won't crash, because at the very least it can display the raw bytes of the SWF.

What is more, Flash MX 2004 offers a JavaScript API for creating .fla files. That makes it possible to create a .fla file that exports to that same SWF. Everything is there now.

Never mind the sounds, shapes and bitmaps. Thieves do not like these assets, because stealing them is too easy to detect. Thieves like to steal ActionScript: because there are hidden passwords, because there are scripts that block normal playback of the movie, and because there are functions they can modify and reuse with less risk of being caught.

If ASV can only decompile the script into raw bytecode, it is useless to most thieves. So many authors try their best to prevent ASV 4 from decompiling the script into ActionScript or p-code.

In fact, most other decompilers simply crash when the script fails to match their patterns.

Here are the techniques from that history. The protective effect of each technique lasted only a short while and expired soon after it was published on the internet and revealed to the decompiler authors.

Chunk decompiling by the data size - the sentence

Most techniques for confusing or crashing a decompiler succeed because of a difference in behavior between the player and the decompiler.

The player executes bytecodes one by one. In the real world, this is like reading a book one word at a time. A decompiler, on the other hand, usually chops the chain of bytecodes into meaningful pieces. In the real world, this is like reading a book one sentence at a time.

The reason why decompilers behave like this is simple: most p-code commands are followed by the size of their data.

For the bytecode ("96 08 00 00 61 00 07 03 00 00 00 1D"), the decompiler meets 0x96, which means "push". Push what? The next two bytes give the size (0x0008): what gets pushed is stored in the next 8 bytes ("00 61 00 07 03 00 00 00"). So the decompiler usually chops off a chunk according to the data size, and this chunk is inevitably interpreted as "push something". The sentence is therefore ("96 08 00 00 61 00 07 03 00 00 00"), and the period is here. The next byte is the beginning of the next sentence.

What follows is 0x1D, which means "setVariable".

Well, the 8 bytes "something" will be parsed further to be a string 'a' and a number 3.
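The further parsing of those 8 bytes can be sketched in Python. This is a minimal sketch covering only the value types that appear in this article, under the assumption that type 0 is a zero-terminated string, type 5 a one-byte boolean and type 7 a 32-bit little-endian integer. The name decode_push is made up for this example.

```python
import struct


def decode_push(payload: bytes) -> list:
    """Decode the data part of a push record into Python values."""
    values, pos = [], 0
    while pos < len(payload):
        kind = payload[pos]
        pos += 1
        if kind == 0:                       # zero-terminated string
            end = payload.index(0, pos)
            values.append(payload[pos:end].decode("latin-1"))
            pos = end + 1
        elif kind == 5:                     # boolean, one byte
            values.append(payload[pos] != 0)
            pos += 1
        elif kind == 7:                     # 32-bit little-endian integer
            values.append(struct.unpack_from("<i", payload, pos)[0])
            pos += 4
        else:
            raise ValueError(f"push type {kind} is not handled in this sketch")
    return values


print(decode_push(bytes.fromhex("00 61 00 07 03 00 00 00")))  # ['a', 3]
```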

Let's look at the bytecode ("99 02 00 05 00 96"):

0x99 means branch (or jump). Branch to where? What follows is (0x0002), so the target is stored in the next 2 bytes. Chop it off. Either way, we know "99 02 00 05 00" is a sentence. What follows is 0x96, which is the start of the next sentence.

The third example is the bytecode ("88 08 00 03 00 63 00 62 00 61 00 96 07 00"):

0x88 means define constants. Which constants are defined? What follows is (0x0008), so the constant data is stored in the next 8 bytes. So the sentence is ("88 08 00 03 00 63 00 62 00 61 00"). What follows is the start of the next sentence ("96 07 00 ......"), which is a push with 7 bytes of data.

So the bytecode is chopped into several sentences. Each sentence starts with a command followed by its data, so a sentence is the basic unit. In theory, nothing is wrong with such an approach.
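The chopping can be sketched in Python as follows. It assumes the rule used in the three examples: a command byte of 0x80 or above is followed by a 2-byte little-endian data size, and any other command is a single byte. The name split_sentences is made up for this example, and a record cut off at the end of the input is returned as it is.

```python
def split_sentences(code: bytes) -> list:
    """Chop bytecode into sentences the way a decompiler does."""
    sentences, pos = [], 0
    while pos < len(code):
        opcode = code[pos]
        if opcode >= 0x80 and pos + 3 <= len(code):
            # command byte + 2 size bytes + as many data bytes as the size says
            size = 3 + int.from_bytes(code[pos + 1:pos + 3], "little")
        elif opcode >= 0x80:
            size = len(code) - pos      # size field itself is cut off
        else:
            size = 1                    # commands below 0x80 carry no data
        sentences.append(code[pos:pos + size])
        pos += size
    return sentences


example = bytes.fromhex("96 08 00 00 61 00 07 03 00 00 00 1D")
for sentence in split_sentences(example):
    print(sentence.hex())
```

Note that the splitter trusts every size field without question. That trust is exactly what the techniques in the next part abuse.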

"Make player reads from the middle of the sentence".

OK, let's start discussing the technique for crashing a decompiler: "Make player reads from the middle of the sentence".

First, I will give an example from the real world.

John says good morning.

Mary says thank you.

Now let's create an SWF:

skip 9 words

Tom says John says good morning. skip 3 words

back 7 words

Mary says thank you.

If you read it word by word, the result is the same as the original. But a decompiler reads it sentence by sentence, and several errors occur.

First, it knows Tom says something, but the grammar is not correct. It will report an "error".

Second, it does not see that the second "skip" is a command, because it lies within the sentence.

Third, when it is forced back 7 words, it gets puzzled and assumes that it should execute the whole sentence, starting from "Tom says".

Fourth, this error makes it loop infinitely between the second and third lines.

In summary, we add the junk code "Tom says" and give a wrong data size for the length of this sentence. This wrong length covers the "skip" command.

Let's look at real examples. Please note that these techniques require manipulating the bytecode. They cannot be done in pure ActionScript.

example 1: forward jump with dead codes containing invalid data size

push True

branchifTrue label2

constants ''

label2:

push 'a',3

setVariable

If you look carefully, the line "constants ''" is junk code. It will never be executed. But in theory it would be executed if the result of the second line were not True, so a decompiler will try to decompile it.

Let's make the sentence size after "0x88 - constants" cover all the bytes up to the end of the script. The decompiler will then chop the bytecode into 3 sentences like this:

push True

branchifTrue label2

constants label2: push 'a',3 setVariable

If you try to decompile the SWF, some decompilers will crash because of the four errors I mentioned above. Some decompilers will survive but only display: " if(false){}; "
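The difference between the two readers can be reproduced with a small Python sketch. The byte layout here is an assumption made for illustration and is not taken from forward.zip: the dead "constants" record is reduced to its 3-byte header (0x88 plus a lying 2-byte size of 12), and the branch offset of 3 skips exactly that header. All function names are made up.

```python
import struct

NAMES = {0x96: "push", 0x9D: "branchIfTrue", 0x88: "constants", 0x1D: "setVariable"}


def record_size(code, pos):
    """1 byte for plain commands, else 3 bytes plus the declared data size."""
    if code[pos] < 0x80:
        return 1
    return 3 + int.from_bytes(code[pos + 1:pos + 3], "little")


def decode_push(data):
    """Decode string (0), boolean (5) and 32-bit integer (7) push values."""
    values, pos = [], 0
    while pos < len(data):
        kind, pos = data[pos], pos + 1
        if kind == 0:
            end = data.index(0, pos)
            values.append(data[pos:end].decode("latin-1"))
            pos = end + 1
        elif kind == 5:
            values.append(data[pos] != 0)
            pos += 1
        elif kind == 7:
            values.append(struct.unpack_from("<i", data, pos)[0])
            pos += 4
        else:
            raise ValueError("push type not handled in this sketch")
    return values


def linear_sweep(code):
    """Decompiler style: sentence by sentence, trusting every size field."""
    seen, pos = [], 0
    while pos < len(code):
        seen.append(NAMES.get(code[pos], hex(code[pos])))
        pos += record_size(code, pos)
    return seen


def execute(code):
    """Player style: run one command at a time and follow the branches."""
    stack, variables, trace, pos = [], {}, [], 0
    while pos < len(code):
        op, size = code[pos], record_size(code, pos)
        data = code[pos + 3:pos + size]
        pos += size                          # offsets count from the next command
        trace.append(NAMES.get(op, hex(op)))
        if op == 0x96:
            stack.extend(decode_push(data))
        elif op == 0x9D:
            if stack.pop():
                pos += struct.unpack("<h", data)[0]
        elif op == 0x1D:
            value = stack.pop()
            variables[stack.pop()] = value
    return trace, variables


SCRIPT = bytes.fromhex(
    "96 02 00 05 01 "                      # push True
    "9D 02 00 03 00 "                      # branchIfTrue, skip 3 bytes
    "88 0C 00 "                            # dead constants, claims 12 data bytes
    "96 08 00 00 61 00 07 03 00 00 00 "    # push 'a', 3
    "1D"                                   # setVariable
)
print(linear_sweep(SCRIPT))   # the decompiler's view
print(execute(SCRIPT))        # what the player really does
```

The sweep reports only push, branchIfTrue and constants, while the execution trace also contains the push 'a', 3 and the setVariable that the lying size field swallowed.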

ASV 3 also fails to reveal the script. But ASV 4 succeeds.

To crack this SWF, we remove the dead code "constants xxxx" (0x88 and the two bytes that follow), and then everything gets decompiled.

Here is a zip file explaining in detail how to create such a protected file. forward.zip

example 2: backward jump with dead codes containing invalid data size

push 'b'

label1:

push 'a',3

setVariable

branch label2

branch label1

label2:

The -push 'b'- is junk code that we are going to modify in order to crash ASV 4.

Let's modify the "length of sentence" for -push 'b'-: change the 2 bytes of data after "0x96" so that the sentence length covers the bytes up to just before -branch label1-. The decompiler will then take the bytecode as 3 sentences:

push label1: push 'a',3 setVariable branch label2

branch label1

label2:

Now the decompiler complains that it does not know what is pushed. It also interprets this as an infinite loop between the first sentence and the second sentence.

This technique crashes most decompilers, including flair. Flasm fails. ASV 4 fails.

To crack this SWF, we manually remove the dead code "push 'b'" (0x96 and the two bytes that follow), and then everything gets decompiled.

After this technique became known, Burakk fixed ASV 4 to handle such dead code correctly. The next version will survive.

Here is a zip file with the details and an example of how to create such a protected file. backward.zip

The non-displayable chars and obfuscator

In addition to blocking decompilation, we can make the decompiled result hard to read. You may check the web sites about obfuscators.

Basically, an obfuscator renames the variables and functions.

function -3(-4){trace(-4);}
function -1(0,-2){
if(0<-2){-3(1);}
}

Surely, a thief cannot just copy and paste this script for his own use. The compiler won't let you name a function like this.

The limitation of an obfuscator is that changing function names may break scripts like the one below, which builds the function name from strings at run time.

function myFun(){trace("myFun");}
a="my";
b="Fun";
this[a+b]();
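The same pitfall can be shown with a small Python analogy. It is not ActionScript: a dictionary stands in for the object that holds the functions, and the obfuscate helper is hypothetical. The renaming changes the name of the function but not the strings that the script glues together at run time.

```python
def obfuscate(namespace: dict, renames: dict) -> dict:
    """Rename entries the way an obfuscator renames functions."""
    return {renames.get(name, name): fn for name, fn in namespace.items()}


original = {"myFun": lambda: "myFun"}
a, b = "my", "Fun"

print(original[a + b]())          # the dynamic lookup works before renaming

scrambled = obfuscate(original, {"myFun": "-3"})
print((a + b) in scrambled)       # False: "my" + "Fun" no longer finds anything
```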

Another technique is to make function names non-displayable. For example, if I name my functions with Chinese characters, the decompiler might fail to display them properly. Then we will see:

function ?(?,?){?.?=?;}

ASV 4 uses Unicode to display such characters, so the result is readable. This technique adds only a slight difficulty.

self made protection

If you find a good way to protect your SWF from decompilers, don't share it with others. At least, don't publish it on the internet.

Of course, 100% protection is impossible, and at the very least it is not protected from me. However, not everyone knows the SWF format very well.

Many thieves can only decompile an SWF with the software available. So, if your goal is to keep out as many thieves as possible, keep your method secret.

Let me remind you once again: your SWF is naked. To someone who knows the SWF format, everything is revealed. If your goal is just to protect some functions you figured out, you should be safe. Such people are unlikely to steal your functions; they prefer to write their own.

Common protection:

We create a game on our website for online play. Unfortunately, many visitors come to our site only once and then keep a downloaded copy to play offline. Sometimes we even see our game appear on someone else's domain.

To avoid this, many have tried various ways of protection.

1. Check domain.

We write a script to check _url. If _url is "http://www.myDomain.net/game.swf", the movie plays; otherwise it stops or quits. When it is played offline, the _url will be something like "file://C| someSub/game.swf"; when it is put on someone else's domain, the _url will be "http://www.others.net/game.swf". So this technique does add some protection.

Of course, not against a malicious decompiler. The script can easily be removed or changed with flasm to disable the check. Although we are unlikely to see a cracked SWF shown publicly on another domain, it may be passed around as an SWF playable offline.

2. Server Password:

We write a script so that, when the game plays, it loads a password from the server. If the password is null, the game stops or quits.

This is easily cracked by a malicious user who edits the SWF and removes that script. So what script cannot be removed?

When our game starts to play, it loads from the server map data that is essential to the game. The malicious user cannot remove this script. He must supply the map data.

Of course, he can pick the map data out of the cache in the temporary directory and supply it to the SWF to activate the game.

3. Hold the swf or variables in the server.

This is an extension of the second technique, and it is widely used.

Initially, game.swf is only a loader. When the play button is clicked, another SWF is loaded. When a map is needed, the map data is loaded from the server. When the player meets an obstacle, the obstacle SWF is loaded from the server as well. The data for each new level is also sent from the server.

Here we see the principle: the best way to protect from decompiling is "do not give".

If some careless thief downloads only game.swf, he cannot play the game. He would need to pick all the SWFs and variables out of the cache, open all the SWFs, and edit the variable names to match the names of the files in the cache.

If our maps are randomly generated by CGI, the thief might have only one copy of a map. He has no way to generate maps randomly. If it is a maze game, then at best he can play only one puzzle. The copy has lost the fantastic "dynamically created maze" function.
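As an illustration of what stays on the server, here is a minimal Python sketch of a maze generator that a CGI script might call for each request. Everything in it is an assumption made for this example: the text format ('#' for wall, space for passage), the depth-first carving, and the name generate_maze. A cached copy of one response contains a single maze and none of this code.

```python
import random


def generate_maze(width: int, height: int, seed=None) -> list:
    """Carve a random maze of width x height cells; returns rows of text."""
    rng = random.Random(seed)
    grid = [["#"] * (2 * width + 1) for _ in range(2 * height + 1)]
    grid[1][1] = " "
    visited = {(0, 0)}
    stack = [(0, 0)]
    while stack:
        x, y = stack[-1]
        neighbours = [
            (x + dx, y + dy)
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= x + dx < width
            and 0 <= y + dy < height
            and (x + dx, y + dy) not in visited
        ]
        if not neighbours:
            stack.pop()                      # dead end: go back one cell
            continue
        nx, ny = rng.choice(neighbours)
        grid[y + ny + 1][x + nx + 1] = " "   # knock down the wall in between
        grid[2 * ny + 1][2 * nx + 1] = " "   # open the new cell
        visited.add((nx, ny))
        stack.append((nx, ny))
    return ["".join(row) for row in grid]


print("\n".join(generate_maze(8, 4)))
```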

If the malicious user plays the game and meets a new obstacle, the game might fail, because he does not have the new obstacle SWF in his cache.

So many algorithms and functions are kept on the server side, and the SWF is nothing but an interface. Perfect protection. The pitfall is that this turns it into a CGI game, not a Flash game. We are discussing the protection of the SWF, so the solution is not quite fair, because it does not protect the SWF itself.

by ericlin