Simplify virtually executes an app to understand its behavior and then tries to optimize the code so that it behaves identically but is easier for a human to understand. Each optimization type is simple and generic, so it doesn't matter which specific type of obfuscation is used.
The code on the left is a decompilation of an obfuscated app, and the code on the right has been deobfuscated.
There are three parts to the project: smalivm, simplify, and the demo app.
Any if or switch conditional with an unknown value results in both branches being taken.
usage: java -jar simplify.jar <input> [options]
deobfuscates a dalvik executable
-et,--exclude-types <pattern> Exclude classes and methods which include REGEX, eg: "com/android", applied after include-types
-h,--help Display this message
-ie,--ignore-errors Ignore errors while executing and optimizing methods. This may lead to unexpected behavior.
--include-support Attempt to execute and optimize classes in Android support library packages, default: false
-it,--include-types <pattern> Limit execution to classes and methods which include REGEX, eg: ";->targetMethod\("
--max-address-visits <N> Give up executing a method after visiting the same address N times, limits loops, default: 10000
--max-call-depth <N> Do not call methods after reaching a call depth of N, limits recursion and long method chains, default: 50
--max-execution-time <N> Give up executing a method after N seconds, default: 300
--max-method-visits <N> Give up executing a method after executing N instructions in that method, default: 1000000
--max-passes <N> Do not run optimizers on a method more than N times, default: 100
-o,--output <file> Output simplified input to FILE
--output-api-level <LEVEL> Set output DEX API compatibility to LEVEL, default: 15
-q,--quiet Be quiet
--remove-weak Remove code even if there are weak side effects, default: true
-v,--verbose <LEVEL> Set verbosity to LEVEL, default: 0
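The interaction between -it and -et can be sketched in a few lines of Python. This is only an illustration of the order described in the help text (include first, then exclude), assuming that "include REGEX" means the pattern may match anywhere in the signature. The function name and the sample signatures are hypothetical and are not taken from Simplify's source.

```python
import re

def should_process(signature, include=None, exclude=None):
    """Return True if a class or method signature passes both filters."""
    # Include filter first: when given, the pattern must match somewhere.
    if include is not None and not re.search(include, signature):
        return False
    # Exclude filter is applied afterwards, to whatever passed the include filter.
    if exclude is not None and re.search(exclude, signature):
        return False
    return True

# Hypothetical signatures, for illustration only.
signatures = [
    "Lorg/cf/obfuscated/MainActivity;->onCreate(Landroid/os/Bundle;)V",
    "Lorg/cf/obfuscated/StringHolder;->get(I)Ljava/lang/String;",
    "Lcom/android/Helper;->run()V",
]
selected = [s for s in signatures
            if should_process(s, include="org/cf/obfuscated", exclude="MainActivity")]
print(selected)
```

With these inputs only the StringHolder signature is kept: the first one is dropped by the exclude pattern and the last one never passes the include pattern.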
Building requires the Java Development Kit 8 (JDK) to be installed.
Because this project contains submodules for Android frameworks, either clone with --recursive:
git clone --recursive https://github.com/CalebFenton/simplify.git
Or update submodules at any time with:
git submodule update --init --recursive
Then, to build a single jar which contains all dependencies:
./gradlew fatjar
The Simplify jar will be in simplify/build/libs/. You can test that it's working by simplifying the provided obfuscated example app. Here's how you'd run it (you may need to change simplify.jar):
java -jar simplify/build/libs/simplify.jar -it "org/cf/obfuscated" -et "MainActivity" simplify/obfuscated-app.apk
To understand what's getting deobfuscated, check out Obfuscated App's README.
If Simplify fails, try these recommendations, in order:
1. Target only a few methods or classes with the -it option.
2. If a limit was exceeded, try higher values for --max-address-visits, --max-call-depth, and --max-method-visits.
3. Run with -v or -v 2 and report the issue with the logs and a hash of the DEX or APK.
If building on Windows, and building fails with an error similar to:
Could not find tools.jar. Please check that C:\Program Files\Java\jre1.8.0_151 contains a valid JDK installation.
This means Gradle is unable to find a proper JDK path. Make sure the JDK is installed, set the JAVA_HOME environment variable to your JDK path, and make sure to close and re-open the command prompt you use to build.
Don't be shy. I think virtual execution and deobfuscation are fascinating problems. Anyone who's interested is automatically cool and contributions are welcome, even if it's just to fix a typo. Feel free to ask questions in the issues and submit pull requests.
Please include a link to the APK or DEX and the full command you're using. This makes it much easier to reproduce (and thus fix) your issue.
If you can't share the sample, please include the file hash (SHA1, SHA256, etc).
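If no hashing tool is at hand, a few lines of Python are enough to produce a SHA-256 for an issue report. This is a minimal sketch. The function name is made up for this example, and the file is read in chunks so that large APKs do not need to fit in memory.

```python
import hashlib

def file_sha256(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Usage (the path is a placeholder): print(file_sha256("sample.apk"))
```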
If an op places a value of a type which can be turned into a constant such as a string, number, or boolean, this optimization will replace that op with the constant. For example:
const-string v0, "VGVsbCBtZSBvZiB5b3VyIGhvbWV3b3JsZCwgVXN1bC4="
invoke-static {v0}, Lmy/string/Decryptor;->decrypt(Ljava/lang/String;)Ljava/lang/String;
# Decrypts to: "Tell me of your homeworld, Usul."
move-result-object v0
In this example, an encrypted string is decrypted and placed into v0. Since strings are "constantizable", the move-result-object v0 can be replaced with a const-string:
const-string v0, "VGVsbCBtZSBvZiB5b3VyIGhvbWV3b3JsZCwgVXN1bC4="
invoke-static {v0}, Lmy/string/Decryptor;->decrypt(Ljava/lang/String;)Ljava/lang/String;
const-string v0, "Tell me of your homeworld, Usul."
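The idea can be mimicked on a toy instruction list. The sketch below is not how Simplify is implemented; it models ops as Python tuples and stands in for the decryptor with Base64 decoding, which is an assumption made only so the example can run (the sample string happens to be Base64 for the decrypted text). All names are hypothetical.

```python
import base64

DECRYPT = "Lmy/string/Decryptor;->decrypt(Ljava/lang/String;)Ljava/lang/String;"

def decrypt(text):
    # Assumption for this sketch only: the decryptor is modelled as Base64.
    return base64.b64decode(text).decode("utf-8")

KNOWN_METHODS = {DECRYPT: decrypt}  # methods the toy executor can run

def propagate_constants(ops):
    """Replace a move-result op with const-string when the result is a known string."""
    registers = {}      # register name -> known constant value
    last_result = None  # result of the most recent invoke, if it could be computed
    out = []
    for op in ops:
        kind = op[0]
        if kind == "const-string":
            registers[op[1]] = op[2]
            out.append(op)
        elif kind == "invoke-static":
            _, args, method = op
            func = KNOWN_METHODS.get(method)
            known = func is not None and all(a in registers for a in args)
            last_result = func(*(registers[a] for a in args)) if known else None
            out.append(op)
        elif kind.startswith("move-result"):
            reg = op[1]
            if isinstance(last_result, str):
                registers[reg] = last_result
                out.append(("const-string", reg, last_result))
            else:
                registers.pop(reg, None)  # the value is now unknown
                out.append(op)
            last_result = None
        else:
            out.append(op)
    return out

program = [
    ("const-string", "v0", "VGVsbCBtZSBvZiB5b3VyIGhvbWV3b3JsZCwgVXN1bC4="),
    ("invoke-static", ["v0"], DECRYPT),
    ("move-result-object", "v0"),
]
optimized = propagate_constants(program)
print(optimized[-1])
```

The first two ops are left untouched here; removing them is the job of a separate dead code pass.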
Code is dead if removing it cannot possibly alter the behavior of the app. The most obvious case is unreachable code (e.g. if (false) { // dead }). If code is reachable, it may still be considered dead if it doesn't affect any state outside of the method, i.e. it has no side effects. For example, code may not affect the return value of the method, alter any class variables, or perform any IO. This is difficult to determine in static analysis. Luckily, smalivm doesn't have to be clever. It just stupidly executes everything it can and assumes there are side effects if it can't be sure. Consider the example from Constant Propagation:
const-string v0, "VGVsbCBtZSBvZiB5b3VyIGhvbWV3b3JsZCwgVXN1bC4="
invoke-static {v0}, Lmy/string/Decryptor;->decrypt(Ljava/lang/String;)Ljava/lang/String;
const-string v0, "Tell me of your homeworld, Usul."
In this code, the invoke-static no longer affects the return value of the method. Let's assume it doesn't do anything weird like write bytes to the file system or a network socket, so it has no side effects. It can simply be removed.
const-string v0, "VGVsbCBtZSBvZiB5b3VyIGhvbWV3b3JsZCwgVXN1bC4="
const-string v0, "Tell me of your homeworld, Usul."
Finally, the first const-string assigns a value to a register, but that value is never used, i.e. the assignment is dead. It can also be removed.
const-string v0, "Tell me of your homeworld, Usul."
Huzzah!
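The two removals above can be reproduced with a small backward pass over straight-line code. This is a simplified sketch, not Simplify's actual algorithm: it ignores branches, models each op as a (name, writes, reads) tuple, and takes the side effect decision as a callback. It assumes v0 is still needed after the last op.

```python
def remove_dead_ops(ops, live_out, has_side_effects):
    """Drop ops with no side effects whose written registers are never read later.

    ops: list of (name, writes, reads) tuples for straight-line code.
    live_out: registers still needed after the last op.
    has_side_effects: callback that returns True when an op may touch outside state.
    """
    live = set(live_out)
    kept = []
    for op in reversed(ops):
        _, writes, reads = op
        needed = has_side_effects(op) or any(reg in live for reg in writes)
        if not needed:
            continue  # dead: nothing observes this op
        live.difference_update(writes)  # earlier values of these registers are overwritten
        live.update(reads)              # values this op reads must stay alive
        kept.append(op)
    kept.reverse()
    return kept

ops = [
    ("const-string (encrypted)", ["v0"], []),
    ("invoke-static decrypt", [], ["v0"]),
    ("const-string (decrypted)", ["v0"], []),
]

# Assume v0 is used after these ops and decrypt() is known to be side-effect free.
optimistic = remove_dead_ops(ops, {"v0"}, lambda op: False)

# When unsure, treat every invoke as having side effects.
cautious = remove_dead_ops(ops, {"v0"}, lambda op: op[0].startswith("invoke"))

print(optimistic)
print(cautious)
```

In the cautious run nothing is removed: the invoke is kept, and because it reads v0, the first const-string stays alive too.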
One major challenge with static analysis of Java is reflection. It's just not possible to know what the arguments are for reflection methods without doing careful data flow analysis. There are smart, clever ways of doing this, but smalivm does it by just executing the code. When it finds a reflected method invocation such as:
invoke-virtual {v0, v1, v2}, Ljava/lang/reflect/Method;->invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;
It can know the values of v0, v1, and v2. If it's sure what the values are, it can replace the call to Method.invoke() with an actual non-reflected method invocation. The same applies for reflected field and class lookups.
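A heavily simplified sketch of that decision is shown below. It is not Simplify's implementation. It only handles methods without arguments, represents a known Method object as a plain dict, and uses hypothetical class and method names. A real rewrite also has to unpack the argument array and adjust how the result is moved.

```python
def unreflect(registers, known_values):
    """Return a direct invoke op for a Method.invoke() call, or None to leave it alone.

    registers: (method_reg, target_reg, args_reg) of the reflective call.
    known_values: register -> value found by virtual execution; a missing
                  register means the value is unknown.
    """
    method_reg, target_reg, args_reg = registers
    method = known_values.get(method_reg)
    args = known_values.get(args_reg)
    # Only rewrite when the Method is known and the argument array is known to be empty.
    if method is None or args is None or len(args) != 0:
        return None
    signature = method["class"] + "->" + method["name"] + method["descriptor"]
    if method["static"]:
        return "invoke-static {}, " + signature
    return "invoke-virtual {" + target_reg + "}, " + signature

# Hypothetical values discovered for v0 (the Method) and v2 (the argument array).
known = {
    "v0": {"class": "Lorg/cf/Example;", "name": "secret", "descriptor": "()V", "static": True},
    "v2": [],
}
direct = unreflect(("v0", "v1", "v2"), known)
print(direct)
```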
For everything that doesn't fit cleanly into a particular category, there are peephole optimizations. These include removing useless check-cast ops, replacing Ljava/lang/String;-><init> calls with const-string, and so on.
.method public static test1()I
.locals 2
new-instance v0, Ljava/lang/Integer;
const/4 v1, 0x1
invoke-direct {v0, v1}, Ljava/lang/Integer;-><init>(I)V
invoke-virtual {v0}, Ljava/lang/Integer;->intValue()I
move-result v0
return v0
.end method
All this does is v0 = 1.
.method public static test1()I
.locals 2
new-instance v0, Ljava/lang/Integer;
const/4 v1, 0x1
invoke-direct {v0, v1}, Ljava/lang/Integer;-><init>(I)V
invoke-virtual {v0}, Ljava/lang/Integer;->intValue()I
const/4 v0, 0x1
return v0
.end method
The move-result v0 is replaced with const/4 v0, 0x1. This is because there is only one possible return value for intValue()I and the return type can be made a constant. The arguments v0 and v1 are unambiguous and do not change. That is to say, there's a consensus of values for every possible execution path at intValue()I. Other types of values that can be turned into constants:
- numbers: const/4, const/16, etc.
- strings: const-string
- classes: const-class
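The "consensus" check can be pictured as follows. This is a minimal sketch with made-up names, assuming the value a register holds at one address has been collected for every execution path that reaches it. It is not smalivm's actual API.

```python
UNKNOWN = object()  # marker for a value that could not be determined

def consensus(values):
    """Return the value shared by every execution path, or UNKNOWN if they disagree."""
    if not values:
        return UNKNOWN
    first = values[0]
    if first is UNKNOWN:
        return UNKNOWN
    for value in values[1:]:
        # Compare types as well, so that True and 1 are not treated as the same value.
        if value is UNKNOWN or type(value) is not type(first) or value != first:
            return UNKNOWN
    return first

print(consensus([1, 1, 1]))            # every path agrees
print(consensus([1, 2]) is UNKNOWN)    # paths disagree
```

Only when a single value comes back can the op be replaced with a constant. Any disagreement or unknown value leaves the original op in place.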
.method public static test1()I
.locals 2
const/4 v0, 0x1
return v0
.end method
Because the code above const/4 v0, 0x1 does not affect state outside of the method (no side effects), it can be removed without changing behavior. If there were a method call that wrote something to the file system or network, it couldn't be removed because it affects state outside the method. Or if test1()I took a mutable argument, such as a LinkedList, any instructions that accessed it couldn't be considered dead.
Other examples of dead code:
if (false) { dead_code(); }
This tool is available under a dual license: a commercial one suitable for closed source projects and a GPL license that can be used in open source software.
Depending on your needs, you must choose one of them and follow its policies. Details of the policies and agreements for each license type are available in the LICENSE.COMMERCIAL and LICENSE.GPL files.