First of all, I want to say that my question may be invalid, in the sense that the documents I am asking for may not exist at all. So please correct me if you think so.
Can someone point me to some documents on converting assembly from a stack-based architecture to a register-based architecture? Currently I am working on the VMP VM, which converts x86 machine code to a stack-based machine by replacing register locations with intermediate stack locations. I have studied compiler documents on stack machines, but so far I haven't found any document that explains how to convert the machine structure from one to the other.
The biggest issue I am facing is the loss of intermediate register data. What I mean is that the registers on the right-hand side are always converted into stack locations, and those locations don't always link together. If we have something like this
Code:
mov ebx,eax
mov ebx,ecx
it will be transformed into
Code:
load [sr1],[mem1]
store [mem2],[sr1]
load [sr1],[mem3]
store [mem4],[sr1]
where
Code:
sr = stack register
mem = stack memory, i.e. scratch memory on the stack
mem1 = mapped as eax
mem2 = intermediate memory which was supposed to be ebx
mem3 = mapped as ecx
mem4 = mapped in output as ebx
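To make the transformation above concrete, here is a minimal Python sketch of how I understand it. This is my own model, not VMP's actual algorithm: the function name `virtualize`, the `slot_of` table, and the rule "every destination gets a fresh scratch slot" are all assumptions chosen so that the output reproduces the four lines above.

```python
def virtualize(moves):
    """Translate (dst, src) register moves into load/store pairs.

    Hypothetical model: each source register is read from the slot that
    currently holds it, and each destination gets a brand-new slot.
    """
    slot_of = {}   # register name -> scratch slot currently holding its value
    counter = 0    # number of scratch slots handed out so far
    out = []

    def fresh():
        nonlocal counter
        counter += 1
        return "mem%d" % counter

    for dst, src in moves:
        if src not in slot_of:
            # First time this register is read: give it an input slot.
            slot_of[src] = fresh()
        out.append("load [sr1],[%s]" % slot_of[src])
        # The destination always moves to a new slot; the old one is orphaned.
        slot_of[dst] = fresh()
        out.append("store [%s],[sr1]" % slot_of[dst])
    return out, slot_of


code, final_map = virtualize([("ebx", "eax"), ("ebx", "ecx")])
print("\n".join(code))
print(final_map)
```

In this model the final table only knows eax, ecx and the last slot of ebx, so mem2 no longer belongs to any register once the second mov has run. That is the information loss I am describing.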
If you look carefully, you will see that mem2 can't be deduced. Some of you may argue that if we do dead store elimination we won't need to analyze the first mov at all. The problem is that this is a very simple example. Code like the following poses a huge problem for me right now.
Code:
mov eax,ebx
mov ecx,eax
mov edx,ecx
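For reference, the dead store elimination argument can be sketched as a backward liveness pass over a straight-line list of moves. This is only an illustration under my own assumptions: the function name `eliminate_dead_moves` is made up, and I assume every register is live at the end of the block.

```python
def eliminate_dead_moves(moves, live_out):
    """Drop (dst, src) moves whose destination is overwritten before it is read.

    moves: straight-line list of (dst, src) pairs.
    live_out: registers assumed to be live after the last move.
    """
    live = set(live_out)
    kept = []
    for dst, src in reversed(moves):
        if dst in live:
            # The value is needed later: keep the move, and from here
            # backwards dst is no longer needed but src is.
            live.discard(dst)
            live.add(src)
            kept.append((dst, src))
        # Otherwise dst is redefined before any use, so the move is dead.
    kept.reverse()
    return kept


all_regs = {"eax", "ebx", "ecx", "edx"}
print(eliminate_dead_moves([("ebx", "eax"), ("ebx", "ecx")], all_regs))
print(eliminate_dead_moves([("eax", "ebx"), ("ecx", "eax"), ("edx", "ecx")], all_regs))
```

On the first example the pass removes `mov ebx,eax`, but on the three-instruction chain it removes nothing, so the intermediate registers still have to be recovered some other way.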
Not all of the intermediate registers can be deduced, because the final stack memory -> register mapping only maps selected memory locations to registers and discards the rest of the scratch memory. An example is as follows
Code:
//sample code
MOV EAX,EBX
MOV ECX,EAX
MOV EDX,ECX
MOV EDX,EBP
MOV EAX,0x539
MOV EAX,EDX
//transformed stack based machine code
loc=00000030 (EBX) -> [sr1]
[sr1] -> loc=00000000 (missing)
loc=00000000 -> [sr1]
[sr1] -> loc=00000004 (ECX)
loc=00000004 (ECX) -> [sr1]
[sr1] -> loc=00000008 (missing)
loc=00000038 (EBP) -> [sr1]
const 539 -> [sr2]
[sr2] -> loc=00000020 (missing)
[sr1] -> loc=0000003C (EDX)
loc=0000003C (EDX) -> [sr1]
[sr1] -> loc=0000001C (EAX)
loc = scratch memory on the stack, sr1/sr2 = stack registers. The registers in brackets are deduced from the final transformation back to the register machine in the VMP return handler.
You will see that several intermediate stack locations can't be deduced, and it is not safe to assign registers to them at random, as this may corrupt the assembly.
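To show what can still be recovered from the listing above, here is a small Python sketch that traces which original value ends up in every location. It is only my own illustration, not a solution: the helper `trace_values` and the short names `loc30`, `loc00`, etc. (the last two hex digits of each loc) are assumptions, and a location that is read before it is written is treated as an unknown input.

```python
def trace_values(ops):
    """Forward pass over (src, dst) copies, recording the value in each location.

    Names starting with "const " are immediates; everything else is a
    stack register or scratch location.
    """
    values = {}
    for src, dst in ops:
        if src.startswith("const "):
            val = src
        else:
            # A location read before any write holds an unknown input value.
            val = values.setdefault(src, "init(" + src + ")")
        values[dst] = val
    return values


listing = [
    ("loc30", "sr1"), ("sr1", "loc00"),
    ("loc00", "sr1"), ("sr1", "loc04"),
    ("loc04", "sr1"), ("sr1", "loc08"),
    ("loc38", "sr1"),
    ("const 539", "sr2"), ("sr2", "loc20"),
    ("sr1", "loc3C"),
    ("loc3C", "sr1"), ("sr1", "loc1C"),
]

for loc, val in sorted(trace_values(listing).items()):
    print(loc, "=", val)
```

This tells me that the "missing" slots loc00 and loc08 are copies of the input in loc30 (EBX) and that loc20 holds the constant, but it does not tell me which x86 register each of them originally stood for, which is the part I am stuck on.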
I am looking for expert advice in this area (especially from people with compiler design knowledge) on how to map the registers and what kind of knowledge I need to solve this.