View Single Post
  #12  
Old 08-17-2004, 06:01
mihaliczaj
 
Posts: n/a
Quote:
Originally Posted by br00t_4_c
You can reconstruct source code from a disassembled binary executable that may well closely resemble the original source code but as Sarge very astutely mentioned variable and function names will be mangled, comments will be lost, etc.
Yes, that is the realistic view. These information are simply lost during compilation. Assuming there is no debug info, just the compiled, stripped .exe we can't do anything against this.
I am sure, however, that even such a source with names like variable1, variable2 etc. can be a great help for anyone who wants to understand the original ideas behind the code.
Don't forget that the other alternative is facing a huge, unorganized list of assembly functions.
Some information that I am sure can be (at least partly) recognized when the optimization doesn't hide it:
C++ specific:
- virtual tables
- ctors, dtors
- inheritance relationships
- dynamic_casts
- class sizes
- stack objects
- global objects
- member functions
- member pointers, member function pointers
- heap allocations
C specific:
- switch statements
- loops
- function calls
Assuming we have a tool that collects all these information and it is built into a debugger (OllyDbg for example), just imagine what help it could be.
OllyDbg supports writing comments next to the code. If this tool also supported naming of the recognized structures, complete parts of the original code could be reconstructed.
Quote:
Originally Posted by br00t_4_c
Maybe if there was a decompiler that incorporated some kick-ass artificial intelligence that could magically analyze and emulate the personality and proclivities of the developer who wrote the code
Creating utopias
If we had such AI the programs probably wouldn't be written by humans. Humans would just assist defining the target conditions.
Then the abstraction level would be more far from the assembly level, and that AI would still be not enough. But there would be recognizable patterns in the created code and a tool could be created to display them. A lot of info would be lost, but with some patience complete parts of the original code (or target conditions) could be reconstructed manually.
Back to the ground
As the coder is (probably) a person, just another person is smart enough to recognize his/her thoughts. The automatically recognizable patterns should be shown, these are the language elements (cycles, function calls etc.), but the rest should be left to the user. I know that there are some coding patterns that could be easily recognized, but the rev.engineer is who should recognize and mark them. The best tool doesn't do everything, but it does it in a reliable way you can build on.

If anyone were interested in writing the OllyDbg plugin contoured above, I would give further details on the possible recognization of the mentioned structures with pleasure.
Reply With Quote