Exetools


Conquest 03-01-2015 01:50

Stack Machine to Register Machine
 
First of all, I want to state that my question may be invalid in the sense that the documents I am asking for may not exist at all. Please correct me if you think so.
Can someone point me to documents on converting assembly from a stack-based architecture to a register-based architecture? I am currently working on the vmp VM, which converts x86 code into code for a stack-based machine by replacing register locations with intermediate stack locations. I have studied compiler documents on stack machines, but so far I haven't found any that explain how to convert one machine structure into the other.

The biggest issue I am facing is the loss of intermediate register data. What I mean is that the right-side registers are always converted into stack locations, and these don't always link together. If we have something like this

Code:

mov ebx,eax
mov ebx,ecx

it will be transformed into
Code:

load [sr1],[mem1]
store [mem2],[sr1]
load [sr1],[mem3]
store [mem4],[sr1]

where
Code:

sr = stack register
mem = stack memory , i.e. scratch memory in stack
mem1 = mapped as eax
mem2 = intermediate memory which was supposed to be ebx
mem3 = mapped as ecx
mem4 = mapped in output as ebx

If you look carefully, you will see that mem2 can't be deduced. Some of you may argue that with dead store elimination we wouldn't need to analyze the first store at all. The problem is that this is a very simple example. Code like the following poses a huge problem for me right now.
Code:

mov eax,ebx
mov ecx,eax
mov edx,ecx

None of the intermediate registers can be deduced, because the final stack memory -> register mapping covers only selected memory locations and discards the rest of the scratch memory. An example follows.
Code:

//sample code
MOV EAX,EBX
MOV ECX,EAX
MOV EDX,ECX
MOV EDX,EBP
MOV EAX,0x539
MOV EAX,EDX
//transformed stack based machine code
loc=00000030 (EBX)                -> [sr1]
[sr1]                                -> loc=00000000 (missing)
loc=00000000                        -> [sr1]
[sr1]                                -> loc=00000004(ECX)
loc=00000004(ECX)                -> [sr1]
[sr1]                                -> loc=00000008 (missing)
loc=00000038(EBP)                -> [sr1]
const 539                        -> [sr2]
[sr2]                                -> loc=00000020(missing)
[sr1]                                -> loc=0000003C(EDX)
loc=0000003C(EDX)                -> [sr1]
[sr1]                                -> loc=0000001C(EAX)

loc = scratch memory on the stack; sr1/sr2 = stack registers. The registers in brackets are deduced from the final transformation back to the register machine in the vmp return handler.
You will see that several intermediate stack locations can't be deduced, and it is not safe to assign registers to them at random, as this may corrupt the assembly.
I am looking for expert advice in this area (especially from people with compiler design knowledge) about how to map the registers and what kind of knowledge I need to solve this.
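For the straight-line trace above, the missing intermediate locations do not have to be recovered to get equivalent code: following each copy back to its origin gives the net effect on the locations whose register is known. The following is a minimal Python sketch of that copy propagation, not a vmp tool. The short location labels, the `KNOWN` mapping and the choice of `LIVE_OUT` are assumptions read off the listing.

```python
# Trace from the listing as (source, destination) moves.
# The short labels ("loc30", "sr1", ...) are hypothetical names for the VM locations.
TRACE = [
    ("loc30", "sr1"), ("sr1", "loc00"),
    ("loc00", "sr1"), ("sr1", "loc04"),
    ("loc04", "sr1"), ("sr1", "loc08"),
    ("loc38", "sr1"),
    ("const 0x539", "sr2"), ("sr2", "loc20"),
    ("sr1", "loc3C"),
    ("loc3C", "sr1"), ("sr1", "loc1C"),
]

# Assumption: these are the locations the return handler maps back to registers.
KNOWN = {"loc30": "EBX", "loc04": "ECX", "loc38": "EBP",
         "loc3C": "EDX", "loc1C": "EAX"}

# Assumption: only the registers written by the sample code are of interest.
LIVE_OUT = ["loc04", "loc3C", "loc1C"]


def propagate_copies(trace, live_out):
    """Map each live-out location to the value it ends up holding.

    A location that was never written stands for its own initial contents.
    """
    value = {}
    for src, dst in trace:
        value[dst] = value.get(src, src)
    return {loc: value.get(loc, loc) for loc in live_out}


def summarize(trace, live_out, known):
    """Express the net effect in register names wherever they are known."""
    origins = propagate_copies(trace, live_out)
    return {known[loc]: known.get(origin, origin)
            for loc, origin in origins.items()}


RESULT = summarize(TRACE, LIVE_OUT, KNOWN)
print(RESULT)
```

Under those assumptions the trace collapses to ECX = EBX, EDX = EBP, EAX = EBP, and the stores to loc00, loc08 and loc20 turn out to be dead. Real handlers also do arithmetic and branches, so this only covers plain copies in a single basic block.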

mcp 03-01-2015 21:11

Not sure I really understand your question.
It seems you're asking how to reconstruct the original register-based instructions? That is not possible, as that information is destroyed.

For example, given that stack based VM, you cannot distinguish

Code:

mov eax,ebx
mov ecx,eax
mov edx,ecx

from

Code:

mov ebx,eax
mov edx,ebx
mov ecx,edx

What you can do, however, is what compiler construction calls "register allocation". It basically means that you start with arbitrarily many variables (in this case the stack variables from the stack machine) and assign these variables to registers while trying to minimize the number of register spills. Even a greedy algorithm should work sufficiently well in that case.
OTOT, for what reason do you actually want to do this anyway? Re-assemble VM code?
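To make the register allocation idea concrete, here is a rough Python sketch of a greedy allocator over live intervals for straight-line code. It is in the spirit of linear scan but simplified: when no register is free it spills the newcomer instead of picking the best spill candidate. The instruction format `(dst, [srcs])`, the variable names and the `"spill"` marker are assumptions made for illustration.

```python
def live_intervals(code):
    """Return {var: (first_index, last_index)} for straight-line code.

    code is a list of (dst, [srcs]) tuples, one per instruction.
    """
    intervals = {}
    for i, (dst, srcs) in enumerate(code):
        for var in [dst, *srcs]:
            start, _ = intervals.get(var, (i, i))
            intervals[var] = (start, i)
    return intervals


def greedy_allocate(code, registers):
    """Assign a register (or "spill") to every variable, in order of first use."""
    intervals = live_intervals(code)
    free = list(registers)
    active = []        # (end_index, var) for variables currently holding a register
    assignment = {}
    for var, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Release registers of variables whose interval ended before this one starts.
        for old_end, old_var in list(active):
            if old_end < start:
                active.remove((old_end, old_var))
                free.append(assignment[old_var])
        if free:
            assignment[var] = free.pop(0)
            active.append((end, var))
        else:
            assignment[var] = "spill"   # no register left: keep it in memory
    return assignment


# Hypothetical stack variables t0..t3, e.g. taken from VM scratch locations.
EXAMPLE = [
    ("t0", []),            # t0 = input
    ("t1", ["t0"]),        # t1 = t0
    ("t2", ["t1"]),        # t2 = t1
    ("t3", ["t2", "t0"]),  # t3 = f(t2, t0)
]

print(greedy_allocate(EXAMPLE, ["eax", "ecx", "edx"]))
print(greedy_allocate(EXAMPLE, ["eax", "ecx"]))
```

With three registers every variable gets one, and t3 reuses the register that t1 no longer needs. With only two registers t2 is spilled. Code with branches would need liveness analysis across basic blocks before anything like this applies.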

Conquest 03-01-2015 21:44

Quote:

Originally Posted by mcp (Post 97995)
OTOT, for what reason do you actually want to do this anyway? Re-assemble VM code?

Yes. You also understood my question correctly. Oddly enough, the Themida CISC VM preserves the register information, while vmp wipes it completely (unless they are using some hidden trick, like watermarking the register handlers or assigning the VM p-code with some algorithm that encrypts the register information). I can't find the register information at all.
Thanks for your info anyway. I look forward to advice from more people involved in this area.

mcp 03-07-2015 19:49

Also see this discussion on hackernews on stack vs register machines and the corresponding article.

Conquest 03-07-2015 22:48

Quote:

Originally Posted by mcp (Post 98136)
Also see this discussion on hackernews on stack vs register machines and the corresponding article.

This. This really helped me, kind sir. I knew that Java and C# IL code is written for a stack-based machine, but what I overlooked is the fact that it runs at almost native speed on x86. This has led me to believe that somewhere there is a conversion from stack-based to register-based machine code. I will look into the Mono source next. Hopefully it is as efficient as the Windows C# JIT compiler (which would probably reveal more useful information about this). This is my first time writing a compiler, so the road ahead will probably be bumpy. Thanks for your excellent tips and help. I look forward to more useful information, and I will update the thread as more comes to hand.

PS: I couldn't give a thanks for the help because I am not a family member.
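The general idea behind a stack-to-register conversion can be sketched by simulating the operand stack symbolically: every pushed or computed value gets a fresh virtual register, and each stack operation becomes a three-address instruction on those names. This is a toy Python illustration with a made-up instruction set (`push`, `add`, `sub`, `mul`, `store`), not a description of how Mono or any other JIT actually does it.

```python
from itertools import count


def stack_to_three_address(program):
    """Translate toy stack code into three-address code on virtual registers.

    program is a list of tuples such as ("push", "a"), ("add",), ("store", "x").
    """
    stack = []          # names of the virtual registers currently on the stack
    out = []            # emitted three-address instructions
    fresh = count()     # generator for fresh virtual register numbers
    for op, *args in program:
        if op == "push":
            tmp = f"t{next(fresh)}"
            out.append(f"{tmp} = {args[0]}")   # copy, so later stores cannot alias it
            stack.append(tmp)
        elif op in ("add", "sub", "mul"):
            rhs = stack.pop()
            lhs = stack.pop()
            tmp = f"t{next(fresh)}"
            out.append(f"{tmp} = {op} {lhs}, {rhs}")
            stack.append(tmp)
        elif op == "store":
            out.append(f"{args[0]} = {stack.pop()}")
        else:
            raise ValueError(f"unknown opcode: {op}")
    return out


# x = (a + b) * c in the toy stack language.
PROGRAM = [
    ("push", "a"), ("push", "b"), ("add",),
    ("push", "c"), ("mul",),
    ("store", "x"),
]

for line in stack_to_three_address(PROGRAM):
    print(line)
```

The output uses an unbounded number of virtual registers (t0, t1, ...). Copy propagation can then remove the redundant copies, and a register allocator maps what remains onto the real registers.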

0xd4d 03-09-2015 06:17

Also see MS' JIT compiler which is now open source: https://github.com/dotnet/coreclr

