2 * Copyright (c) 1996 Barton P. Miller
4 * We provide the Paradyn Parallel Performance Tools (below
5 * described as "Paradyn") on an AS IS basis, and do not warrant its
6 * validity or performance. We reserve the right to update, modify,
7 * or discontinue this software at any time. We shall have no
8 * obligation to supply such updates or modifications or any other
9 * form of support to you.
11 * This license is for research uses. For such uses, there is no
12 * charge. We define "research use" to mean you may freely use it
13 * inside your organization for whatever purposes you see fit. But you
14 * may not re-distribute Paradyn or parts of Paradyn, in any form
15 * source or binary (including derivatives), electronic or otherwise,
16 * to any other organization or entity without our permission.
18 * (for other uses, please contact us at paradyn@cs.wisc.edu)
20 * All warranties, including without limitation, any warranty of
21 * merchantability or fitness for a particular purpose, are hereby excluded.
24 * By your use of Paradyn, you understand and agree that we (or any
25 * other person or entity with proprietary rights in Paradyn) are
26 * under no obligation to provide either maintenance services,
27 * update services, notices of latent defects, or correction of
28 * defects for Paradyn.
30 * Even if advised of the possibility of such damages, under no
31 * circumstances shall we (or any other person or entity with
32 * proprietary rights in the software licensed hereunder) be liable
33 * to you or any third party for direct, indirect, or consequential
34 * damages of any character regardless of type of action, including,
35 * without limitation, loss of profits, loss of use, loss of good
36 * will, or computer failure or malfunction. You agree to indemnify
37 * us (and any other person or entity with proprietary rights in the
38 * software licensed hereunder) for any and all liability it may
39 * incur to third parties resulting from your use of Paradyn.
43 * inst-x86.C - x86 dependent functions and code generator
45 * $Log: inst-x86.C,v $
46 * Revision 1.27 1997/09/28 22:22:31 buck
47 * Added some more #ifdef BPATCH_LIBRARYs to eliminate some Dyninst API
48 * library dependencies on files in rtinst.
50 * Revision 1.26 1997/08/19 19:50:38 naim
51 * Adding support to dynamically link libdyninstRT by using dlopen on sparc-
54 * Revision 1.25 1997/08/18 01:34:23 buck
55 * Ported the Dyninst API to Windows NT.
57 * Revision 1.1.1.5 1997/07/08 20:02:44 buck
58 * Bring latest changes from Wisconsin over to Maryland repository.
60 * Revision 1.24 1997/07/08 19:15:13 buck
61 * Added support for the x86 Solaris platform and dynamically linked
62 * executables to the dyninst API library.
64 * Revision 1.23 1997/06/23 17:09:29 tamches
65 * include of instPoint.h is new
67 * Revision 1.22 1997/06/14 18:27:48 ssuen
68 * Moved class instPoint from inst-x86.C to inst-x86.h and added/moved the following
69 * standard definitions to the public section
73 * pd_Function *callee_;
75 * function_base *iPgetFunction() const { ... }
76 * function_base *iPgetCallee() const { ... }
77 * const image *iPgetOwner() const { ... }
78 * Address iPgetAddress() const { ... }
80 * Revision 1.21 1997/06/06 18:27:15 mjrg
81 * Changed checkCallPoints to keep calls to shared object functions in the
82 * list of calls for a function
84 * Revision 1.20 1997/06/04 13:24:24 naim
85 * Removing debugging info - naim
87 * Revision 1.19 1997/05/23 23:01:26 mjrg
89 * bug fix to inst-x86.C
91 * Revision 1.18 1997/05/16 22:02:27 mjrg
92 * Fixed problem with instrumentation of conditional jumps
94 * Revision 1.17 1997/05/07 19:03:12 naim
95 * Getting rid of old support for threads and turning it off until the new
96 * version is finished. Additionally, new superTable, baseTable and superVector
97 * classes for future support of multiple threads. The fastInferiorHeap class has
100 * Revision 1.16 1997/05/02 18:25:36 mjrg
101 * Changes for allowing different main functions on different platforms
103 * Revision 1.15 1997/04/29 23:16:02 mjrg
104 * Changes for WindowsNT port
105 * Delayed check for DYNINST symbols to allow linking libdyninst dynamically
106 * Changed way paradyn and paradynd generate resource ids
107 * Changes to instPoint class in inst-x86.C to reduce size of objects
108 * Added initialization for process->threads to fork and attach constructors
110 * Revision 1.14 1997/04/14 00:22:07 newhall
111 * removed class pdFunction and replaced it with base class function_base and
112 * derived class pd_Function
114 * Revision 1.13 1997/02/26 23:42:52 mjrg
115 * First part on WindowsNT port: changes for compiling with Visual C++;
116 * moved unix specific code to unix.C
118 * Revision 1.12 1997/02/21 20:13:38 naim
119 * Moving files from paradynd to dyninstAPI + moving references to dataReqNode
120 * out of the ast class. This is the first pre-dyninstAPI commit! - naim
122 * Revision 1.11 1997/02/03 17:20:55 lzheng
123 * Changes made for combining the long jump and short jump on solaris platform
125 * Revision 1.10 1997/01/30 18:19:27 tamches
126 * emitInferiorRPCtrailer revamped; can now stop to read the result value of
129 * Revision 1.9 1997/01/27 19:40:57 naim
130 * Part of the base instrumentation for supporting multithreaded applications
131 * (vectors of counter/timers) implemented for all current platforms +
132 * different bug fixes - naim
134 * Revision 1.8 1997/01/21 23:58:53 mjrg
135 * Moved allocation of virtual registers to basetramp and fix to
136 * emitInferiorRPCheader to allocate virtual registers
138 * Revision 1.7 1997/01/21 00:27:58 tamches
139 * removed uses of DYNINSTglobalData
141 * Revision 1.6 1996/11/26 16:08:08 naim
142 * Fixing asserts - naim
144 * Revision 1.5 1996/11/19 16:28:04 newhall
145 * Fix to stack walking on Solaris: find leaf functions in stack (these can occur
146 * on top of stack or in middle of stack if the signal handler is on the stack)
147 * Fix to relocated functions: new instrumentation points are kept on a per
148 * process basis. Cleaned up some of the code.
150 * Revision 1.4 1996/11/14 14:27:09 naim
151 * Changing AstNodes back to pointers to improve performance - naim
153 * Revision 1.3 1996/11/12 17:48:28 mjrg
154 * Moved the computation of cost to the basetramp in the x86 platform,
155 * and changed other platform to keep code consistent.
156 * Removed warnings, and made changes for compiling with Visual C++
158 * Revision 1.2 1996/10/31 08:51:09 tamches
159 * the shm-sampling commit; routines to implement inferiorRPC; removed some
160 * warnings; added noCost param to some fns.
162 * Revision 1.1 1996/10/18 23:54:14 mjrg
170 #include "util/h/headers.h"
172 #ifndef BPATCH_LIBRARY
173 #include "rtinst/h/rtinst.h"
175 #include "dyninstAPI/src/symtab.h"
176 #include "dyninstAPI/src/process.h"
177 #include "dyninstAPI/src/inst.h"
178 #include "dyninstAPI/src/instP.h"
179 #include "dyninstAPI/src/ast.h"
180 #include "dyninstAPI/src/util.h"
181 #include "dyninstAPI/src/stats.h"
182 #include "dyninstAPI/src/os.h"
183 #include "paradynd/src/showerror.h"
185 #include "dyninstAPI/src/arch-x86.h"
186 #include "dyninstAPI/src/inst-x86.h"
187 #include "dyninstAPI/src/instPoint.h" // includes instPoint-x86.h
188 #include "dyninstAPI/src/instP.h" // class returnInstance
190 extern bool isPowerOf2(int value, int &result);
192 // The general machine registers.
193 // These values are taken from the Pentium manual and CANNOT be changed.
203 // Size of a jump rel32 instruction
204 #define JUMP_REL32_SZ (5)
206 // Size of a call rel32 instruction
207 #define CALL_REL32_SZ (5)
209 #define PUSH_RM_OPC1 (0xFF)
210 #define PUSH_RM_OPC2 (6)
211 #define CALL_RM_OPC1 (0xFF)
212 #define CALL_RM_OPC2 (2)
213 #define PUSH_EBP (0x50+EBP)
214 #define SUB_REG_IMM32 (5)
218 Function arguments are on the stack and are addressed with a displacement
219 from EBP. EBP points to the saved EBP, EBP+4 is the saved return address,
220 EBP+8 is the first parameter.
221 TODO: what about far calls?
224 #define PARAM_OFFSET (8)
227 // number of virtual registers
228 #define NUM_VIRTUAL_REGISTERS (32)
230 // offset from EBP of the saved EAX for a tramp
231 #define SAVED_EAX_OFFSET (-NUM_VIRTUAL_REGISTERS*4-4)
237 checkInstructions: check that there are no known jumps to the instructions
238 before and after the point.
240 void instPoint::checkInstructions() {
241 unsigned currAddr = addr_;
242 unsigned OKinsns = 0;
244 // if jumpAddr_ is not zero, this point has been checked already
249 unsigned maxSize = JUMP_SZ;
250 if (address() == func()->getAddress(0)) // entry point
252 tSize = insnAtPoint_.size();
254 if (!owner()->isJumpTarget(currAddr)) {
255 // check instructions before point
256 unsigned insnsBefore_ = insnsBefore();
257 for (unsigned u = 0; u < insnsBefore_; u++) {
259 tSize += (*insnBeforePt_)[u].size();
260 currAddr -= (*insnBeforePt_)[u].size();
261 if (owner()->isJumpTarget(currAddr)) {
262 // must remove instruction from point
263 // fprintf(stderr, "check instructions point %x, jmp to %x\n", addr,currAddr);
269 (*insnBeforePt_).resize(OKinsns);
271 // this is the address where we insert the jump
272 jumpAddr_ = currAddr;
274 // check instructions after point
275 currAddr = addr_ + insnAtPoint_.size();
277 unsigned insnsAfter_ = insnsAfter();
278 for (unsigned u = 0; tSize < maxSize && u < insnsAfter_; u++) {
279 if (owner()->isJumpTarget(currAddr))
282 unsigned size = (*insnAfterPt_)[u].size();
287 (*insnAfterPt_).resize(OKinsns);
290 if (tSize < maxSize) {
291 tSize = insnAtPoint_.size();
293 if (insnBeforePt_) (*insnBeforePt_).resize(0);
294 if (insnAfterPt_) (*insnAfterPt_).resize(0);
300 /**************************************************************
302 * machine dependent methods of pd_Function
304 **************************************************************/
306 // Determine if the called function is a "library" function or a "user" function
307 // This cannot be done until all of the functions have been seen, verified, and
310 void pd_Function::checkCallPoints() {
315 vector<instPoint*> non_lib;
317 for (i=0; i<calls.size(); ++i) {
318 /* check to see where we are calling */
322 if (!p->insnAtPoint().isCallIndir()) {
323 loc_addr = p->insnAtPoint().getTarget(p->address());
324 file()->exec()->addJumpTarget(loc_addr);
325 pd_Function *pdf = (file_->exec())->findFunction(loc_addr);
327 if (pdf && !pdf->isLibTag()) {
331 // if this is a call outside the function, keep it
332 if((loc_addr < getAddress(0))||(loc_addr > (getAddress(0)+size()))){
342 // Indirect call -- be conservative, assume it is a call to
343 // an unnamed user function
344 //assert(!p->callee());
353 // this function is not needed
354 Address pd_Function::newCallPoint(Address, const instruction,
355 const image *, bool &)
356 { assert(0); return 0; }
359 // see if we can recognize a jump table and skip it
360 // return the size of the table in tableSz.
361 bool checkJumpTable(image *im, instruction insn, Address addr,
366 const unsigned char *instr = insn.ptr();
369 the instruction usually used for jump tables is
370 jmp dword ptr [REG*4 + ADDR]
371 where ADDR is an immediate following the SIB byte.
372 The opcode is 0xFF and the MOD/RM byte is 0x24.
373 The SS field (bits 7 and 6) of SIB is 2, and the
374 base (bits 2, 1, 0) is 5. The index bits (5, 4, 3)
377 if (instr[0] == 0xFF && instr[1] == 0x24 &&
378 ((instr[2] & 0xC0)>>6) == 2 && (instr[2] & 0x7) == 5) {
379 const unsigned tableBase = *(const int *)(instr+3);
380 //fprintf(stderr, "Found jump table at %x %x\n",addr, tableBase);
381 // check if the table is right after the jump and inside the current function
382 if (tableBase > funcBegin && tableBase < funcEnd) {
383 // table is within function code
384 if (tableBase < addr+insn.size()) {
385 fprintf(stderr, "bad indirect jump at %x\n", addr);
387 } else if (tableBase > addr+insn.size()) {
388 // jump table may be at the end of the function code - adjust funcEnd
392 // skip the jump table
393 for (const unsigned *ptr = (unsigned *)tableBase;
394 *ptr >= funcBegin && *ptr <= funcEnd; ptr++) {
395 //fprintf(stderr, " jump table entry = %x\n", *(unsigned *)ptr);
396 tableSz += sizeof(int);
400 const unsigned char *ptr = im->getPtrToInstruction(tableBase);
401 for ( ; *(const unsigned *)ptr >= funcBegin && *(const unsigned *)ptr <= funcEnd;
402 ptr += sizeof(unsigned)) {
403 //fprintf(stderr, " jump table entry = %x\n", *(unsigned *)ptr);
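// Illustrative sketch, not part of the original file: the opcode test that
// checkJumpTable applies to `jmp dword ptr [REG*4 + ADDR]` can be isolated
// as a small standalone predicate. The names looksLikeJumpTableJmp and
// readTableBase are hypothetical, and the memcpy-based read assumes a
// little-endian host, as on x86.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns true if the bytes encode FF 24 SIB disp32 with a scale of 4 and
// no base register, i.e. jmp dword ptr [index*4 + disp32].
bool looksLikeJumpTableJmp(const unsigned char *instr) {
    const unsigned char sib = instr[2];
    return instr[0] == 0xFF && instr[1] == 0x24 &&
           ((sib & 0xC0) >> 6) == 2 &&   // SS field: scale factor 4
           (sib & 0x07) == 5;            // base field 5: disp32, no base register
}

// Reads the 32-bit table address that follows the SIB byte.
std::uint32_t readTableBase(const unsigned char *instr) {
    std::uint32_t base;
    std::memcpy(&base, instr + 3, sizeof base);
    return base;
}
```

// For example, the bytes FF 24 85 00 10 40 00 encode jmp [eax*4 + 0x401000];
// the predicate accepts them and readTableBase yields 0x401000.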
411 /* auxiliary data structures for function findInstPoints */
412 enum { EntryPt, CallPt, ReturnPt };
415 point_(): point(0), index(0), type(0) {};
416 point_(instPoint *p, unsigned i, unsigned t): point(p), index(i), type(t) {};
423 bool pd_Function::findInstPoints(const image *i_owner) {
424 // sorry for this hack, but this routine can modify the image passed in,
425 // which doesn't occur on other platforms --ari
426 image *owner = (image *)i_owner; // const cast
429 //fprintf(stderr,"Function %s, size = %d\n", prettyName().string_of(), size());
433 // XXXXX kludge: these functions are called by DYNINSTgetCPUtime,
434 // they can't be instrumented or we would have an infinite loop
435 if (prettyName() == "gethrvtime" || prettyName() == "_divdi3"
436 || prettyName() == "GetProcessTimes")
439 point_ *points = new point_[size()];
440 //point_ *points = (point_ *)alloca(size()*sizeof(point));
441 unsigned npoints = 0;
443 const unsigned char *instr = (const unsigned char *)owner->getPtrToInstruction(getAddress(0));
444 Address adr = getAddress(0);
445 unsigned numInsns = 0;
450 // keep a buffer with all the instructions in this function
451 instruction *allInstr = new instruction[size()+5];
452 //instruction *allInstr = (instruction *)alloca((size()+5)*sizeof(instruction));
454 // define the entry point
455 insnSize = insn.getNextInstruction(instr);
456 instPoint *p = new instPoint(this, owner, adr, insn);
458 points[npoints++] = point_(p, numInsns, EntryPt);
460 // check if the entry point contains another point
461 if (insn.isJumpDir()) {
462 Address target = insn.getTarget(adr);
463 owner->addJumpTarget(target);
464 if (target < getAddress(0) || target >= getAddress(0) + size()) {
465 // jump out of function
466 // this is an empty function
471 } else if (insn.isReturn()) {
472 // this is an empty function
476 } else if (insn.isCall()) {
477 // TODO: handle calls at entry point
478 // call at entry point
479 //instPoint *p = new instPoint(this, owner, adr, insn);
481 //points[npoints++] = point_(p, numInsns, CallPt);
482 //fprintf(stderr,"Function %s, call at entry point\n", prettyName().string_of());
488 allInstr[numInsns] = insn;
493 // get all the instructions for this function, and define the instrumentation
494 // points. For now, we only add one instruction to each point.
495 // Additional instructions, for the points that need them, will be added later.
497 Address funcEnd = getAddress(0) + size();
498 for ( ; adr < funcEnd; instr += insnSize, adr += insnSize) {
499 insnSize = insn.getNextInstruction(instr);
500 assert(insnSize > 0);
502 if (adr + insnSize > funcEnd) {
506 if (insn.isJumpIndir()) {
507 unsigned jumpTableSz;
508 // check for jump table. This may update funcEnd
509 if (!checkJumpTable(owner, insn, adr, getAddress(0), funcEnd, jumpTableSz)) {
512 //fprintf(stderr,"Function %s, size = %d, bad jump table\n",
513 // prettyName().string_of(), size());
516 // process the jump instruction
517 allInstr[numInsns] = insn;
520 if (jumpTableSz > 0) {
521 // skip the jump table
522 // insert an illegal instruction with the size of the jump table
523 insn = instruction(instr, ILLEGAL, jumpTableSz);
526 } else if (insn.isJumpDir()) {
527 // check for jumps out of this function
528 Address target = insn.getTarget(adr);
529 owner->addJumpTarget(target);
530 if (target < getAddress(0) || target >= getAddress(0) + size()) {
531 // jump out of function
532 instPoint *p = new instPoint(this, owner, adr, insn);
534 points[npoints++] = point_(p, numInsns, ReturnPt);
536 } else if (insn.isReturn()) {
537 instPoint *p = new instPoint(this, owner, adr, insn);
539 points[npoints++] = point_(p, numInsns, ReturnPt);
541 } else if (insn.isCall()) {
542 // calls to adr+5 are not real calls; dynamically linked libraries
543 // use them to get the address of the code.
544 // We skip them here.
545 if (insn.getTarget(adr) != adr + 5) {
546 instPoint *p = new instPoint(this, owner, adr, insn);
548 points[npoints++] = point_(p, numInsns, CallPt);
552 allInstr[numInsns] = insn;
554 assert(npoints < size());
555 assert(numInsns <= size());
559 // there are often nops after the end of the function. We get them here,
560 // since they may be useful for instrumenting the return point
561 for (u = 0; u < 4; u++) {
562 if (owner->isValidAddress(adr)) {
563 insnSize = insn.getNextInstruction(instr);
565 allInstr[numInsns] = insn;
567 assert(numInsns < size()+5);
577 // add extra instructions to the points that need it.
578 unsigned lastPointEnd = 0;
579 unsigned thisPointEnd = 0;
580 for (u = 0; u < npoints; u++) {
581 instPoint *p = points[u].point;
582 unsigned index = points[u].index;
583 unsigned type = points[u].type;
584 lastPointEnd = thisPointEnd;
585 thisPointEnd = index;
587 // add instructions before the point
588 unsigned size = p->size();
589 for (int u1 = index-1; size < JUMP_SZ && u1 >= 0 && u1 > (int)lastPointEnd; u1--) {
590 if (!allInstr[u1].isCall()) {
591 p->addInstrBeforePt(allInstr[u1]);
592 size += allInstr[u1].size();
597 lastPointEnd = index;
598 // add instructions after the point
599 if (type == ReturnPt) {
600 // normally, we would not add instructions after the return, but the
601 // compilers often add nops after the return, and we can use them if necessary
602 for (unsigned u1 = index+1; u1 < index+JUMP_SZ-1 && u1 < numInsns; u1++) {
603 if (allInstr[u1].isNop() || *(allInstr[u1].ptr()) == 0xCC) {
604 p->addInstrAfterPt(allInstr[u1]);
612 unsigned maxSize = JUMP_SZ;
613 if (type == EntryPt) maxSize = 2*JUMP_SZ;
614 for (unsigned u1 = index+1; size < maxSize && u1 <= numInsns; u1++) {
615 if (u+1 < npoints && points[u+1].index > u1 && !allInstr[u1].isCall()) {
616 p->addInstrAfterPt(allInstr[u1]);
617 size += allInstr[u1].size();
634 * Given an instruction, relocate it to a new address, patching up
635 * any relative addressing that is present.
636 * The instruction may need to be replaced with a different size instruction
637 * or with multiple instructions.
638 * Return the size of the new instruction(s)
640 unsigned relocateInstruction(instruction insn,
641 int origAddr, int newAddr,
642 unsigned char *&newInsn)
645 Relative address instructions need to be modified. The relative address
646 can be an 8, 16, or 32-bit displacement relative to the next instruction.
647 Since we are relocating the instruction to a different area, we have
648 to replace 8 and 16-bit displacements with 32-bit displacements.
650 All relative address instructions have a one or two-byte opcode followed
651 by a displacement relative to the next instruction:
653 CALL rel16 / CALL rel32
654 Jcc rel8 / Jcc rel16 / Jcc rel32
655 JMP rel8 / JMP rel16 / JMP rel32
657 The only two-byte opcode instructions are the Jcc rel16/rel32,
658 all others have one byte opcode.
660 The instruction JCXZ/JECXZ rel8 does not have an equivalent with rel32
661 displacement. We must generate code to emulate this instruction:
667 A0: JCXZ 2 (jump to A4)
668 A2: JMP 5 (jump to A9)
669 A4: JMP rel32 (relocated displacement)
674 const unsigned char *origInsn = insn.ptr();
675 unsigned insnType = insn.type();
676 unsigned insnSz = insn.size();
677 unsigned char *first = newInsn;
682 if (insnType & REL_B) {
683 /* replace with rel32 instruction, opcode is one byte. */
684 if (*origInsn == JCXZ) {
685 oldDisp = (int)*(const char *)(origInsn+1);
686 newDisp = (origAddr + 2) + oldDisp - (newAddr + 9);
687 *newInsn++ = *origInsn; *(newInsn++) = 2; // jcxz 2
688 *newInsn++ = 0xEB; *newInsn++ = 5; // jmp 5
689 *newInsn++ = 0xE9; // jmp rel32
690 *((int *)newInsn) = newDisp;
691 newInsn += sizeof(int);
695 if (insnType & IS_JCC) {
696 /* Change a Jcc rel8 to Jcc rel32.
697 Must generate a new opcode: a 0x0F followed by (old opcode + 16) */
698 unsigned char opcode = *origInsn++;
700 *newInsn++ = opcode + 0x10;
703 else if (insnType & IS_JUMP) {
704 /* change opcode to 0xE9 */
709 oldDisp = (int)*(const char *)origInsn;
710 newDisp = (origAddr + 2) + oldDisp - (newAddr + newSz);
711 *((int *)newInsn) = newDisp;
712 newInsn += sizeof(int);
715 else if (insnType & REL_W) {
717 if (insnType & PREFIX_OPR)
719 if (insnType & PREFIX_SEG)
721 /* opcode is unchanged, just relocate the displacement */
722 if (*origInsn == (unsigned char)0x0F)
723 *newInsn++ = *origInsn++;
724 *newInsn++ = *origInsn++;
725 oldDisp = *((const short *)origInsn);
726 newDisp = (origAddr + 5) + oldDisp - (newAddr + 3);
727 *((int *)newInsn) = newDisp;
728 newInsn += sizeof(int);
729 } else if (insnType & REL_D) {
731 unsigned nPrefixes = 0;
732 if (insnType & PREFIX_OPR)
734 if (insnType & PREFIX_SEG)
736 for (unsigned u = 0; u < nPrefixes; u++)
737 *newInsn++ = *origInsn++;
739 /* opcode is unchanged, just relocate the displacement */
740 if (*origInsn == 0x0F)
741 *newInsn++ = *origInsn++;
742 *newInsn++ = *origInsn++;
743 oldDisp = *((const int *)origInsn);
744 newDisp = (origAddr + insnSz) + oldDisp - (newAddr + insnSz);
745 *((int *)newInsn) = newDisp;
746 newInsn += sizeof(int);
749 /* instruction is unchanged */
750 for (unsigned u = 0; u < insnSz; u++)
751 *newInsn++ = *origInsn++;
754 return (newInsn - first);
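// Illustrative sketch, not part of the original file: widening a rel8 branch
// to rel32 when it is moved. The displacement is relative to the address of
// the next instruction, so the absolute target is recovered first and the new
// displacement is measured from the end of the widened instruction.
// widenRel8Branch and rel32Target are hypothetical names; only JMP rel8 (0xEB)
// and Jcc rel8 (0x70..0x7F) are handled, and prefixes and JCXZ are ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Rewrites a 2-byte rel8 branch located at origAddr as a rel32 branch that
// will live at newAddr. Returns the size of the new instruction in bytes.
unsigned widenRel8Branch(const unsigned char *orig, std::uint32_t origAddr,
                         std::uint32_t newAddr, unsigned char *out) {
    unsigned char *p = out;
    unsigned newSize;
    if (orig[0] == 0xEB) {                 // JMP rel8 -> JMP rel32 (E9)
        *p++ = 0xE9;
        newSize = 5;
    } else {                               // assumed to be Jcc rel8
        *p++ = 0x0F;                       // two-byte opcode escape
        *p++ = static_cast<unsigned char>(orig[0] + 0x10);
        newSize = 6;
    }
    const std::int32_t oldDisp = static_cast<signed char>(orig[1]);
    // Unsigned arithmetic wraps modulo 2^32, which matches 32-bit addressing.
    const std::uint32_t target = origAddr + 2 + static_cast<std::uint32_t>(oldDisp);
    const std::uint32_t newDisp = target - (newAddr + newSize);
    std::memcpy(p, &newDisp, sizeof newDisp);
    return newSize;
}

// Decodes the target of a rel32 branch given its displacement bytes and the
// address of the instruction that follows it.
std::uint32_t rel32Target(const unsigned char *dispBytes, std::uint32_t nextAddr) {
    std::uint32_t disp;
    std::memcpy(&disp, dispBytes, sizeof disp);
    return nextAddr + disp;
}
```

// Moving JE +0x10 from 0x2000 to 0x8000 yields 0F 84 followed by a rel32
// that still resolves to 0x2012.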
759 * Relocate a conditional jump and change the target to newTarget.
760 * The new target must be within 128 bytes from the new address
761 * Size of instruction is unchanged.
762 * Returns the old target
764 unsigned changeConditionalJump(instruction insn,
765 int origAddr, int newAddr, int newTargetAddr,
766 unsigned char *&newInsn)
769 const unsigned char *origInsn = insn.ptr();
770 unsigned insnType = insn.type();
771 unsigned insnSz = insn.size();
776 if (insnType & REL_B) {
777 /* one byte opcode followed by displacement */
778 /* opcode is unchanged */
780 *newInsn++ = *origInsn++;
781 oldDisp = (int)*(const char *)origInsn;
782 newDisp = newTargetAddr - (newAddr + insnSz);
783 *newInsn++ = (char)newDisp;
785 else if (insnType & REL_W) {
787 if (insnType & PREFIX_OPR)
788 *newInsn++ = *origInsn++;
789 if (insnType & PREFIX_SEG)
790 *newInsn++ = *origInsn++;
792 assert(*origInsn==0x0F);
793 *newInsn++ = *origInsn++; // copy the 0x0F
794 *newInsn++ = *origInsn++; // second opcode byte
796 oldDisp = *((const short *)origInsn);
797 newDisp = newTargetAddr - (newAddr + insnSz);
798 *((short *)newInsn) = (short)newDisp;
799 newInsn += sizeof(short);
801 else if (insnType & REL_D) {
803 if (insnType & PREFIX_OPR)
804 *newInsn++ = *origInsn++;
805 if (insnType & PREFIX_SEG)
806 *newInsn++ = *origInsn++;
808 assert(*origInsn==0x0F);
809 *newInsn++ = *origInsn++; // copy the 0x0F
810 *newInsn++ = *origInsn++; // second opcode byte
812 oldDisp = *((const int *)origInsn);
813 newDisp = newTargetAddr - (newAddr + insnSz);
814 *((int *)newInsn) = (int)newDisp;
815 newInsn += sizeof(int);
818 return (origAddr+insnSz+oldDisp);
823 unsigned getRelocatedInstructionSz(instruction insn)
825 const unsigned char *origInsn = insn.ptr();
826 unsigned insnType = insn.type();
827 unsigned insnSz = insn.size();
829 if (insnType & REL_B) {
830 if (*origInsn == JCXZ)
833 if (insnType & IS_JCC)
835 else if (insnType & IS_JUMP) {
840 else if (insnType & REL_W) {
847 registerSpace *regSpace;
850 bool registerSpace::readOnlyRegister(reg) {
855 We don't use the machine registers to store temporaries,
856 but "virtual registers" that are located on the stack.
857 The stack frame for a tramp is:
859 ebp-> saved ebp (4 bytes)
860 ebp-4: 128-byte space for 32 virtual registers (32*4 bytes)
861 ebp-132: saved registers (8*4 bytes)
862 ebp-164: saved flags register (4 bytes)
864 The temporaries are assigned numbers from 1 so that it is easier
865 to refer to them: -(reg*4)[ebp]. So the first reg is -4[ebp].
867 We are using a fixed number of temporaries now (32), but we could
868 change to using an arbitrary number.
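// Illustrative sketch, not part of the original file: the frame layout above
// expressed as EBP-relative offsets. The constant and function names are
// hypothetical; the values mirror NUM_VIRTUAL_REGISTERS and SAVED_EAX_OFFSET
// as defined earlier in this file.

```cpp
#include <cassert>

constexpr int kNumVirtualRegisters = 32;  // mirrors NUM_VIRTUAL_REGISTERS
constexpr int kSlotSize = 4;              // each virtual register is 4 bytes

// EBP-relative displacement of virtual register `reg`, numbered from 1.
constexpr int virtualRegOffset(int reg) { return -(reg * kSlotSize); }

// The first saved machine register sits just below the virtual register area.
constexpr int savedEaxOffset() { return -kNumVirtualRegisters * kSlotSize - 4; }

// The saved flags follow the eight saved general registers.
constexpr int savedFlagsOffset() { return savedEaxOffset() - 8 * kSlotSize; }
```

// With 32 virtual registers, register 1 lives at -4[ebp], register 32 at
// -128[ebp], the saved EAX at -132[ebp], and the saved flags at -164[ebp].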
871 int deadList[NUM_VIRTUAL_REGISTERS];
872 int deadListSize = sizeof(deadList);
876 static bool inited=false;
881 #if defined(SHM_SAMPLING) && defined(MT_THREAD)
882 for (unsigned u = 0; u < NUM_VIRTUAL_REGISTERS-1; u++) {
884 for (unsigned u = 0; u < NUM_VIRTUAL_REGISTERS; u++) {
889 regSpace = new registerSpace(sizeof(deadList)/sizeof(int), deadList, 0, NULL);
893 void emitJump(unsigned disp32, unsigned char *&insn);
894 void emitSimpleInsn(unsigned opcode, unsigned char *&insn);
895 void emitMovRegToReg(reg dest, reg src, unsigned char *&insn);
896 void emitAddMemImm32(Address dest, int imm, unsigned char *&insn);
897 void emitOpRegImm(int opcode, reg dest, int imm, unsigned char *&insn);
898 void emitMovRegToRM(reg base, int disp, reg src, unsigned char *&insn);
899 void emitMovRMToReg(reg dest, reg base, int disp, unsigned char *&insn);
900 void emitCallRel32(unsigned disp32, unsigned char *&insn);
902 void generateMTpreamble(char *insn, unsigned &base, process *proc)
904 AstNode *t1,*t2,*t3,*t4,*t5;
905 vector<AstNode *> dummy;
911 /* t3=DYNINSTthreadTable[thr_self()] */
912 t1 = new AstNode("DYNINSTthreadPos", dummy);
913 value = sizeof(unsigned);
914 t4 = new AstNode(AstNode::Constant,(void *)value);
915 t2 = new AstNode(timesOp, t1, t4);
919 tableAddr = proc->findInternalAddress("DYNINSTthreadTable",true,err);
921 t5 = new AstNode(AstNode::Constant, (void *)tableAddr);
922 t3 = new AstNode(plusOp, t2, t5);
925 src = t3->generateCode(proc, regSpace, insn, base, false);
927 // this instruction is different on every platform
928 unsigned char *tmp_insn = (unsigned char *) (&insn[base]);
929 emitMovRMToReg(EAX, EBP, -(src*4), tmp_insn);
930 emitMovRegToRM(EBP, -(REG_MT*4), EAX, tmp_insn);
931 base += (unsigned)tmp_insn - (unsigned)(&insn[base]);
932 regSpace->freeRegister(src);
936 * change the insn at addr to be a branch to newAddr.
937 * Used to add multiple tramps to a point.
939 void generateBranch(process *proc, unsigned fromAddr, unsigned newAddr)
941 unsigned char inst[JUMP_REL32_SZ+1];
942 unsigned char *insn = inst;
943 emitJump(newAddr - (fromAddr + JUMP_REL32_SZ), insn);
944 proc->writeTextSpace((caddr_t)fromAddr, JUMP_REL32_SZ, (caddr_t)inst);
948 bool insertInTrampTable(process *proc, unsigned key, unsigned val) {
951 // check for overflow of the tramp table.
952 // stop at 95% capacity to ensure good performance
953 if (proc->trampTableItems == (TRAMPTABLESZ - TRAMPTABLESZ/20))
955 proc->trampTableItems++;
956 for (u = HASH1(key); proc->trampTable[u].key != 0;
957 u = (u + HASH2(key)) % TRAMPTABLESZ)
959 proc->trampTable[u].key = key;
960 proc->trampTable[u].val = val;
962 #if !defined(i386_unknown_nt4_0)
964 Address addr = proc->findInternalAddress("DYNINSTtrampTable",true, err);
966 return proc->writeTextSpace((caddr_t)addr+u*sizeof(trampTableEntry),
967 sizeof(trampTableEntry),
968 (caddr_t)&(proc->trampTable[u]));
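// Illustrative sketch, not part of the original file: open addressing with
// double hashing and a 95% fill cap, in the spirit of insertInTrampTable.
// The table size and the hash1/hash2 functions are assumptions; the real
// TRAMPTABLESZ, HASH1 and HASH2 are defined elsewhere and may differ. A key
// of 0 marks an empty slot, so keys are assumed to be nonzero.

```cpp
#include <cassert>

constexpr unsigned kTableSize = 4096;     // hypothetical power-of-two size

struct Entry { unsigned key; unsigned val; };

struct TrampTable {
    Entry slots[kTableSize] = {};
    unsigned items = 0;
};

// Hypothetical hash functions. The step from hash2 is always odd, so it is
// coprime with the power-of-two table size and the probe visits every slot.
inline unsigned hash1(unsigned key) { return key % kTableSize; }
inline unsigned hash2(unsigned key) { return (key % (kTableSize - 1)) | 1u; }

// Inserts key/val, refusing once the table reaches 95% of its capacity.
bool insertTramp(TrampTable &t, unsigned key, unsigned val) {
    if (t.items == kTableSize - kTableSize / 20) return false;
    t.items++;
    unsigned u = hash1(key);
    while (t.slots[u].key != 0) u = (u + hash2(key)) % kTableSize;
    t.slots[u] = {key, val};
    return true;
}

// Follows the same probe sequence until the key or an empty slot is found.
bool lookupTramp(const TrampTable &t, unsigned key, unsigned &val) {
    unsigned u = hash1(key);
    while (t.slots[u].key != 0) {
        if (t.slots[u].key == key) { val = t.slots[u].val; return true; }
        u = (u + hash2(key)) % kTableSize;
    }
    return false;
}
```

// Capping the fill level keeps probe sequences short and guarantees that a
// lookup for a missing key reaches an empty slot and terminates.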
974 // generate a jump to a base tramp or a trap
975 // return the size of the instruction generated
976 unsigned generateBranchToTramp(process *proc, const instPoint *point, unsigned baseAddr,
977 Address imageBaseAddr, unsigned char *insn) {
978 if (point->size() < JUMP_REL32_SZ) {
980 // the point is not big enough for a jump
981 // First, check if we can use an indirection with the entry point
982 // if that is not possible, we must use a trap
983 // We get 10 bytes for the entry points, instead of the usual five,
984 // so that we have space for an extra jump. We can then insert a
985 // jump to the basetramp in the second slot of the entry point
986 // and use a short 2-byte jump from the point to the second jump.
987 // We adopt the following rule: Only one point in the function
988 // can use the indirect jump, and this is the first return point
989 // with a size that is less than five bytes
990 pd_Function *f = point->func();
991 vector<instPoint *>fReturns = f->funcExits(proc);
993 // first check if this point can use the extra slot in the entry point
995 for (unsigned u = 0; u < fReturns.size(); u++) {
996 if (fReturns[u] == point) {
999 } else if (fReturns[u]->size() < JUMP_SZ)
1003 const instPoint *entry = f->funcEntry(proc);
1004 if (proc->baseMap.defines(entry) && entry->size() >= 2*JUMP_SZ) {
1005 assert(point->jumpAddr() > entry->address());
1006 // actual displacement needs to subtract size of instruction (2 bytes)
1007 int displacement = entry->address() + 5 - point->jumpAddr();
1008 assert(displacement < 0);
1009 if (point->size() >= 2 && (displacement-2) > SCHAR_MIN) {
1010 generateBranch(proc, entry->address()+5, baseAddr);
1012 *insn++ = (char)(displacement-2);
1019 //sprintf(errorLine, "Warning: unable to insert jump in function %s, address %x. Using trap\n",point->func()->prettyName().string_of(),point->address());
1020 //logLine(errorLine);
1022 if (!insertInTrampTable(proc, point->jumpAddr()+imageBaseAddr, baseAddr))
1028 // replace instructions at point with jump to base tramp
1029 emitJump(baseAddr - (point->jumpAddr() + imageBaseAddr + JUMP_REL32_SZ), insn);
1030 return JUMP_REL32_SZ;
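// Illustrative sketch, not part of the original file: the reachability test
// for the 2-byte short jump from a small point back to the second 5-byte slot
// of the function entry. encodeShortJumpToEntrySlot is a hypothetical name.
// The strict comparison against SCHAR_MIN mirrors the check used above,
// which is slightly conservative since a displacement of -128 is encodable.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Tries to encode EB rel8 at jumpAddr so that it lands on entryAddr + 5.
// Returns false if the slot is not behind the point or is out of rel8 range.
bool encodeShortJumpToEntrySlot(std::uint32_t entryAddr, std::uint32_t jumpAddr,
                                unsigned char out[2]) {
    const std::int64_t gap =
        static_cast<std::int64_t>(entryAddr) + 5 - static_cast<std::int64_t>(jumpAddr);
    if (gap >= 0) return false;            // the slot must precede the point
    const std::int64_t disp = gap - 2;     // relative to the end of the 2-byte jump
    if (disp <= SCHAR_MIN) return false;   // does not fit in a signed byte
    out[0] = 0xEB;                         // JMP rel8
    out[1] = static_cast<unsigned char>(disp);
    return true;
}
```

// For an entry at 0x1000 and a point at 0x1030, the encoded jump is EB D3:
// the next instruction is at 0x1032 and 0x1032 - 45 = 0x1005.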
1036 * Install a base tramp, relocating the instructions at location
1037 * The pre and post jumps are filled with a 'jump 0'
1038 * Return a descriptor for the base tramp.
1042 trampTemplate *installBaseTramp(const instPoint *&location, process *proc, bool noCost)
1047 addr instruction cost
1048 0: <relocated instructions before point>
1049 a = size of relocated instructions before point
1050 a+0: jmp a+30 <skip pre insn> 1
1053 a+8: subl esp, 0x80 1
1056 a+16: jmp <global pre inst> 1
1057 a+21: jmp <local pre inst> 1
1061 a+29: add costAddr, cost 3
1062 a+39: <relocated instructions at point>
1064 b = a +30 + size of relocated instructions at point
1065 b+0: jmp b+30 <skip post insn>
1071 b+16: jmp <global post inst>
1072 b+21: jmp <local post inst>
1076 b+29: <relocated instructions after point>
1078 c: jmp <return to user code>
1080 tramp size = 2*23 + 10 + 5 + size of relocated instructions
1081 Make sure to update the size if the tramp is changed
1083 cost of pre and post instrumentation is (1+1+1+5+9+1+1+15+5+3) = 42
1084 cost of rest of tramp is (1+3+1+1)
1089 trampTemplate *ret = new trampTemplate;
1091 unsigned jccTarget = 0; // used when the instruction at the point is a cond. jump
1092 unsigned auxJumpOffset = 0;
1094 // check instructions at this point to find how many instructions
1095 // we should relocate
1096 instPoint *temp = (instPoint *)location;
1097 temp->checkInstructions();
1099 // compute the tramp size
1100 // if there are any changes to the tramp, the size must be updated.
1101 #if defined(SHM_SAMPLING) && defined(MT_THREAD)
1102 unsigned trampSize = 73+2*27;
1104 unsigned trampSize = 73;
1106 for (u = 0; u < location->insnsBefore(); u++) {
1107 trampSize += getRelocatedInstructionSz(location->insnBeforePt(u));
1109 if (location->insnAtPoint().type() & IS_JCC)
1110 trampSize += location->insnAtPoint().size() + 2 * JUMP_SZ;
1112 trampSize += getRelocatedInstructionSz(location->insnAtPoint());
1113 for (u = 0; u < location->insnsAfter(); u++) {
1114 trampSize += getRelocatedInstructionSz(location->insnAfterPt(u));
1117 Address imageBaseAddr;
1118 if (!proc->getBaseAddress(location->owner(), imageBaseAddr)) {
1122 Address costAddr = 0; // for now...
1125 costAddr = (Address)proc->getObsCostLowAddrInApplicSpace();
1128 // get address of DYNINSTobsCostLow to update observed cost
1130 costAddr = proc->findInternalAddress("DYNINSTobsCostLow",
1132 assert(costAddr && !err);
1136 ret->size = trampSize;
1137 unsigned baseAddr = inferiorMalloc(proc, trampSize, textHeap);
1138 ret->baseAddr = baseAddr;
1140 unsigned char *code = new unsigned char[2*trampSize];
1141 unsigned char *insn = code;
1142 unsigned currAddr = baseAddr;
1144 // get the current instruction that is being executed. If the PC is at an
1145 // instruction that is being relocated, we must change the PC.
1146 unsigned currentPC = proc->currentPC();
1148 // emulate the instructions before the point
1149 unsigned origAddr = location->jumpAddr() + imageBaseAddr;
1150 for (u = location->insnsBefore(); u > 0; ) {
1152 if (currentPC == origAddr) {
1153 //fprintf(stderr, "changed PC: %x to %x\n", currentPC, currAddr);
1154 proc->setNewPC(currAddr);
1157 unsigned newSize = relocateInstruction(location->insnBeforePt(u), origAddr, currAddr, insn);
1158 aflag=(newSize == getRelocatedInstructionSz(location->insnBeforePt(u)));
1160 currAddr += newSize;
1161 origAddr += location->insnBeforePt(u).size();
1166 If the instruction at the point is a conditional jump, we relocate it to
1167 the top of the base tramp, and change the code so that the tramp is executed
1168 only if the branch is taken.
1181 T2: relocated instructions after point
1183 Later in the base tramp, at the spot where the instruction at the point would
1184 otherwise be relocated, we insert an unconditional jump to the original target.
1186 if (location->insnAtPoint().type() & IS_JCC) {
1187 currAddr = baseAddr + (insn - code);
1188 assert(origAddr == location->address() + imageBaseAddr);
1189 origAddr = location->address() + imageBaseAddr;
1190 if (currentPC == origAddr) {
1191 //fprintf(stderr, "changed PC: %x to %x\n", currentPC, currAddr);
1192 proc->setNewPC(currAddr);
1195 jccTarget = changeConditionalJump(location->insnAtPoint(), origAddr, currAddr,
1196 currAddr+location->insnAtPoint().size()+5, insn);
1197 currAddr += location->insnAtPoint().size();
1198 auxJumpOffset = insn-code;
1200 origAddr += location->insnAtPoint().size();
1204 // skip pre instrumentation
1205 ret->skipPreInsOffset = insn-code;
1208 // save registers and create a new stack frame for the tramp
1209 emitSimpleInsn(PUSH_EBP, insn); // push ebp
1210 emitMovRegToReg(EBP, ESP, insn); // mov ebp, esp (2-byte instruction)
1211 // allocate space for temporaries (virtual registers)
1212 emitOpRegImm(5, ESP, 128, insn); // sub esp, 128
1213 emitSimpleInsn(PUSHAD, insn); // pushad
1214 emitSimpleInsn(PUSHFD, insn); // pushfd
1216 #if defined(SHM_SAMPLING) && defined(MT_THREAD)
1217 // generate preamble for MT version
1219 generateMTpreamble((char *)insn, base, proc);
1223 // global pre branch
1224 ret->globalPreOffset = insn-code;
1228 ret->localPreOffset = insn-code;
1231 ret->localPreReturnOffset = insn-code;
1233 // restore registers
1234 emitSimpleInsn(POPFD, insn); // popfd
1235 emitSimpleInsn(POPAD, insn); // popad
1236 emitSimpleInsn(LEAVE, insn); // leave
1239 // update cost -- a 10-byte instruction
1240 ret->updateCostOffset = insn-code;
1241 currAddr = baseAddr + (insn-code);
1242 ret->costAddr = currAddr;
1244 emitAddMemImm32(costAddr, 88, insn); // add (costAddr), cost
1247 // minor hack: we still need to fill up the rest of the 10 bytes, since
1248 // assumptions are made about the positioning of instructions that follow.
1249 // (This could in theory be fixed)
1250 // So, 10 NOP instructions (each 1 byte)
1251 for (unsigned foo=0; foo < 10; foo++)
1252 emitSimpleInsn(0x90, insn); // NOP
1256 if (!(location->insnAtPoint().type() & IS_JCC)) {
1257 // emulate the instruction at the point
1258 ret->emulateInsOffset = insn-code;
1259 currAddr = baseAddr + (insn - code);
1260 assert(origAddr == location->address() + imageBaseAddr);
1261 origAddr = location->address() + imageBaseAddr;
1262 if (currentPC == origAddr) {
1263 //fprintf(stderr, "changed PC: %x to %x\n", currentPC, currAddr);
1264 proc->setNewPC(currAddr);
1267 unsigned newSize = relocateInstruction(location->insnAtPoint(), origAddr, currAddr, insn);
1268 aflag=(newSize == getRelocatedInstructionSz(location->insnAtPoint()));
1270 currAddr += newSize;
1271 origAddr += location->insnAtPoint().size();
1273 // instruction at point is a conditional jump.
1274 // The instruction was relocated to the beginning of the tramp (see comments above)
1275 // We must generate a jump to the original target here
1276 assert(jccTarget > 0);
1277 currAddr = baseAddr + (insn - code);
1278 emitJump(jccTarget-(currAddr+JUMP_SZ), insn);
1279 currAddr += JUMP_SZ;
1283 // skip post instrumentation
1284 ret->skipPostInsOffset = insn-code;
1288 // save registers and create a new stack frame for the tramp
1289 emitSimpleInsn(PUSH_EBP, insn); // push ebp
1290 emitMovRegToReg(EBP, ESP, insn); // mov ebp, esp
1291 // allocate space for temporaries (virtual registers)
1292 emitOpRegImm(5, ESP, 128, insn); // sub esp, 128
1293 emitSimpleInsn(PUSHAD, insn); // pushad
1294 emitSimpleInsn(PUSHFD, insn); // pushfd
1296 #if defined(SHM_SAMPLING) && defined(MT_THREAD)
1297 // generate preamble for MT version
1299 generateMTpreamble((char *)insn, base, proc);
1303 // global post branch
1304 ret->globalPostOffset = insn-code;
1307 // local post branch
1308 ret->localPostOffset = insn-code;
1311 ret->localPostReturnOffset = insn-code;
1313 // restore registers
1314 emitSimpleInsn(POPFD, insn); // popfd
1315 emitSimpleInsn(POPAD, insn); // popad
1316 emitSimpleInsn(LEAVE, insn); // leave
1318 // emulate the instructions after the point
1319 ret->returnInsOffset = insn-code;
1320 currAddr = baseAddr + (insn - code);
1321 assert(origAddr == location->address() + imageBaseAddr + location->insnAtPoint().size());
1322 origAddr = location->address() + imageBaseAddr + location->insnAtPoint().size();
1323 for (u = 0; u < location->insnsAfter(); u++) {
1324 if (currentPC == origAddr) {
1325 //fprintf(stderr, "changed PC: %x to %x\n", currentPC, currAddr);
1326 proc->setNewPC(currAddr);
1328 unsigned newSize = relocateInstruction(location->insnAfterPt(u), origAddr, currAddr, insn);
1329 aflag=(newSize == getRelocatedInstructionSz(location->insnAfterPt(u)));
1331 currAddr += newSize;
1332 origAddr += location->insnAfterPt(u).size();
1335 // return to user code
1336 currAddr = baseAddr + (insn - code);
1337 emitJump((location->returnAddr() + imageBaseAddr) - (currAddr+JUMP_SZ), insn);
1339 assert((unsigned)(insn-code) == trampSize);
1341 // update the jumps to skip pre and post instrumentation
1342 unsigned char *ip = code + ret->skipPreInsOffset;
1343 emitJump(ret->updateCostOffset - (ret->skipPreInsOffset+JUMP_SZ), ip);
1344 ip = code + ret->skipPostInsOffset;
1345 emitJump(ret->returnInsOffset - (ret->skipPostInsOffset+JUMP_SZ), ip);
1347 if (auxJumpOffset > 0) {
1348 ip = code + auxJumpOffset;
1349 emitJump(ret->returnInsOffset - (auxJumpOffset+JUMP_SZ), ip);
1352 // put the tramp in the application space
1353 proc->writeDataSpace((caddr_t)baseAddr, insn-code, (caddr_t) code);
1359 // The cost for generateMTpreamble is 25 for pre and post instrumentation:
1360 // movl $0x80570ec,%eax 1
1362 // movl %eax,0xfffffffc(%ebp) 1
1363 // shll $0x2,0xfffffffc(%ebp) 12
1364 // addl $0x84ac670,0xfffffffc(%ebp) 4
1365 // movl 0xfffffffc(%ebp),%eax 1
1366 // movl %eax,0xffffff80(%ebp) 1
1368 ret->prevBaseCost = 42+25;
1369 ret->postBaseCost = 42+25;
1370 ret->prevInstru = false;
1371 ret->postInstru = false;
1376 // This function is used to clear a jump from base to minitramps
1377 // For the x86 platform, we generate a jump to the next instruction
1378 void generateNoOp(process *proc, int addr)
1380 static unsigned char jump0[5] = { 0xE9, 0, 0, 0, 0 };
1381 proc->writeDataSpace((caddr_t) addr, 5, (caddr_t)jump0);
1385 trampTemplate *findAndInstallBaseTramp(process *proc,
1386 instPoint *&location,
1387 returnInstance *&retInstance,
1393 if (!proc->baseMap.defines(location)) {
1394 ret = installBaseTramp(location, proc, noCost);
1395 proc->baseMap[location] = ret;
1396 // generate branch from instrumentation point to base tramp
1397 unsigned imageBaseAddr;
1398 if (!proc->getBaseAddress(location->owner(), imageBaseAddr))
1400 unsigned char *insn = new unsigned char[JUMP_REL32_SZ];
1401 unsigned size = generateBranchToTramp(proc, location, (int)ret->baseAddr,
1402 imageBaseAddr, insn);
1405 retInstance = new returnInstance(new instruction(insn, 0, size), size,
1406 location->jumpAddr() + imageBaseAddr, size);
1408 ret = proc->baseMap[location];
1415 * Install a single mini-tramp.
1418 void installTramp(instInstance *inst, char *code, int codeSize)
1421 //insnGenerated += codeSize/sizeof(int);
1422 (inst->proc)->writeDataSpace((caddr_t)inst->trampBase, codeSize, code);
1424 if (inst->when == callPreInsn) {
1425 if (inst->baseInstance->prevInstru == false) {
1426 atAddr = inst->baseInstance->baseAddr+inst->baseInstance->skipPreInsOffset;
1427 inst->baseInstance->cost += inst->baseInstance->prevBaseCost;
1428 inst->baseInstance->prevInstru = true;
1429 generateNoOp(inst->proc, atAddr);
1433 if (inst->baseInstance->postInstru == false) {
1434 atAddr = inst->baseInstance->baseAddr+inst->baseInstance->skipPostInsOffset;
1435 inst->baseInstance->cost += inst->baseInstance->postBaseCost;
1436 inst->baseInstance->postInstru = true;
1437 generateNoOp(inst->proc, atAddr);
1443 /**************************************************************
1445 * code generator for x86
1447 **************************************************************/
1452 #define MAX_BRANCH (0x1<<31)
1454 unsigned getMaxBranch() {
1455 return (unsigned)MAX_BRANCH;
1459 bool doNotOverflow(int)
1462 // this should be replaced with the correct code. If there isn't any case to
1463 // be checked here, then the function should return TRUE. If there isn't
1464 // any immediate code to be generated, then it should return FALSE - naim
1466 // any int value can be an immediate on the pentium
1472 /* build the MOD/RM byte of an instruction */
1473 inline unsigned char makeModRMbyte(unsigned Mod, unsigned Reg, unsigned RM) {
1474 return ((Mod & 0x3) << 6) + ((Reg & 0x7) << 3) + (RM & 0x7);
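// Illustrative sketch, not part of the original file: how the three ModRM
// fields are packed, and what the result looks like for two instructions the
// base tramp emits. modRM() mirrors makeModRMbyte above; kESP and kEBP are
// hypothetical names for the standard IA-32 register numbers and are assumed
// to match this file's ESP and EBP constants.

```cpp
#include <cassert>
#include <cstdint>

// Standard IA-32 register numbers (assumed to match the constants used here).
constexpr unsigned kESP = 4;
constexpr unsigned kEBP = 5;

// Pack Mod (2 bits), Reg/opcode extension (3 bits) and R/M (3 bits).
constexpr std::uint8_t modRM(unsigned mod, unsigned reg, unsigned rm) {
    return static_cast<std::uint8_t>(((mod & 0x3) << 6) | ((reg & 0x7) << 3) | (rm & 0x7));
}

// Mod = 3 selects the register-direct form: Reg = EBP, R/M = ESP.
static_assert(modRM(3, kEBP, kESP) == 0xEC, "8B EC is mov ebp, esp");
```

// The same byte 0xEC follows opcode 0x81 in "sub esp, imm32": there the Reg
// field carries the opcode extension /5 (SUB) rather than a register.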
1478 Emit the ModRM byte and displacement for addressing modes.
1479 base is a register (EAX, ECX, EDX, EBX, EBP, ESI, EDI)
1480 disp is a displacement
1481 reg_opcode is either a register or an opcode
1483 void emitAddressingMode(int base, int disp, int reg_opcode,
1484 unsigned char *&insn) {
1485 assert(base != ESP);
1487 *insn++ = makeModRMbyte(0, reg_opcode, 5);
1488 *((int *)insn) = disp;
1489 insn += sizeof(int);
1490 } else if (disp == 0 && base != EBP) {
1491 *insn++ = makeModRMbyte(0, reg_opcode, base);
1492 } else if (disp >= -128 && disp <= 127) {
1493 *insn++ = makeModRMbyte(1, reg_opcode, base);
1494 *((char *)insn++) = (char) disp;
1496 *insn++ = makeModRMbyte(2, reg_opcode, base);
1497 *((int *)insn) = disp;
1498 insn += sizeof(int);
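// A minimal sketch, not part of the original file, of how many bytes each
// branch of emitAddressingMode produces (ModRM byte plus displacement). It
// assumes that base == -1 selects the absolute, disp32-only form, as the
// calls from emitMovRegToM and emitMovMToReg suggest. kNoBase, kBaseEBP and
// addressingModeSize are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

constexpr int kNoBase  = -1;  // assumed marker for "absolute address, no base"
constexpr int kBaseEBP = 5;   // standard IA-32 register number for EBP

// Bytes emitted for the addressing mode: one ModRM byte plus 0, 1 or 4
// displacement bytes, chosen in the same order as the branches above.
constexpr std::size_t addressingModeSize(int base, int disp) {
    return base == kNoBase                 ? 1 + 4   // Mod=0, R/M=5: [disp32]
         : (disp == 0 && base != kBaseEBP) ? 1       // Mod=0: [base]
         : (disp >= -128 && disp <= 127)   ? 1 + 1   // Mod=1: [base + disp8]
                                           : 1 + 4;  // Mod=2: [base + disp32]
}
```

// EBP needs the disp8 form even for a zero displacement, because Mod = 0 with
// R/M = 5 means "disp32 with no base register" on IA-32. The virtual registers
// at -(n*4)[ebp] therefore cost two bytes each while n*4 stays within 128.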
1503 /* emit a simple one-byte instruction */
1504 void emitSimpleInsn(unsigned op, unsigned char *&insn) {
1508 // emit a simple register to register instruction: OP dest, src
1509 // opcode is one or two bytes
1510 void emitOpRegReg(unsigned opcode, reg dest, reg src, unsigned char *&insn) {
1514 *insn++ = opcode >> 8;
1515 *insn++ = opcode & 0xFF;
1517 // ModRM byte defines the operands: Mod = 3, Reg = dest, RM = src
1518 *insn++ = makeModRMbyte(3, dest, src);
1522 void emitOpRegRM(unsigned opcode, reg dest, reg base, int disp, unsigned char *&insn) {
1524 emitAddressingMode(base, disp, dest, insn);
1528 void emitOpRMReg(unsigned opcode, reg base, int disp, reg src, unsigned char *&insn) {
1530 emitAddressingMode(base, disp, src, insn);
1533 // emit OP reg, imm32
1534 void emitOpRegImm(int opcode, reg dest, int imm, unsigned char *&insn) {
1536 *insn++ = makeModRMbyte(3, opcode, dest);
1537 *((int *)insn) = imm;
1542 // emit OP r/m, imm32
1543 void emitOpRMImm(unsigned opcode, reg base, int disp, int imm, unsigned char *&insn) {
1545 emitAddressingMode(base, disp, opcode, insn);
1546 *((int *)insn) = imm;
1547 insn += sizeof(int);
1551 // emit OP r/m, imm32
1552 void emitOpRMImm(unsigned opcode1, unsigned opcode2,
1553 reg base, int disp, int imm, unsigned char *&insn) {
1555 emitAddressingMode(base, disp, opcode2, insn);
1556 *((int *)insn) = imm;
1557 insn += sizeof(int);
1560 // emit OP r/m, imm8
1561 void emitOpRMImm8(unsigned opcode1, unsigned opcode2,
1562 reg base, int disp, char imm, unsigned char *&insn) {
1564 emitAddressingMode(base, disp, opcode2, insn);
1568 // emit OP reg, r/m, imm32
1569 void emitOpRegRMImm(unsigned opcode, reg dest,
1570 reg base, int disp, int imm, unsigned char *&insn) {
1572 emitAddressingMode(base, disp, dest, insn);
1573 *((int *)insn) = imm;
1574 insn += sizeof(int);
1577 // emit MOV reg, reg
1578 void emitMovRegToReg(reg dest, reg src, unsigned char *&insn) {
1580 *insn++ = makeModRMbyte(3, dest, src);
1583 // emit MOV reg, r/m
1584 void emitMovRMToReg(reg dest, reg base, int disp, unsigned char *&insn) {
1586 emitAddressingMode(base, disp, dest, insn);
1589 // emit MOV r/m, reg
1590 void emitMovRegToRM(reg base, int disp, reg src, unsigned char *&insn) {
1592 emitAddressingMode(base, disp, src, insn);
1596 void emitMovRegToM(int disp, reg src, unsigned char *&insn) {
1598 emitAddressingMode(-1, disp, src, insn);
1602 void emitMovMToReg(reg dest, int disp, unsigned char *&insn) {
1604 emitAddressingMode(-1, disp, dest, insn);
1607 // emit MOV reg, imm32
1608 void emitMovImmToReg(reg dest, int imm, unsigned char *&insn) {
1609 *insn++ = 0xB8 + dest;
1610 *((int *)insn) = imm;
1611 insn += sizeof(int);
1614 // emit MOV r/m32, imm32
1615 void emitMovImmToRM(reg base, int disp, int imm, unsigned char *&insn) {
1617 emitAddressingMode(base, disp, 0, insn);
1618 *((int*)insn) = imm;
1619 insn += sizeof(int);
1622 // emit MOV mem32, imm32
1623 void emitMovImmToMem(unsigned maddr, int imm, unsigned char *&insn) {
1625 // emit the ModRM byte: we use a 32-bit displacement for the address,
1626 // the ModRM value is 0x05
1628 *((unsigned *)insn) = maddr;
1629 insn += sizeof(unsigned);
1630 *((int*)insn) = imm;
1631 insn += sizeof(int);
1635 // emit Add dword ptr DS:[addr], imm
1636 void emitAddMemImm32(Address addr, int imm, unsigned char *&insn) {
1639 *((unsigned *)insn) = addr;
1640 insn += sizeof(unsigned);
1641 *((int *)insn) = imm;
1642 insn += sizeof(int);
1646 void emitJump(unsigned disp32, unsigned char *&insn) {
1648 *((int *)insn) = disp32;
1649 insn += sizeof(int);
1653 void emitCallRel32(unsigned disp32, unsigned char *&insn) {
1655 *((int *)insn) = disp32;
1656 insn += sizeof(int);
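// Illustrative sketch, not part of the original file: the displacement that
// emitJump and emitCallRel32 expect is relative to the address of the NEXT
// instruction, which is why callers subtract (address + JUMP_SZ) or
// (address + CALL_REL32_SZ) from the target. The sketch assumes both sizes
// are 5 bytes (one opcode byte plus a 32-bit displacement). rel32,
// encodeRel32 and kRel32InsnSize are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kRel32InsnSize = 5;  // opcode byte + disp32

// Displacement for a jmp/call rel32 placed at insnAddr that transfers to target.
// Unsigned arithmetic wraps, so backward targets give the two's-complement value.
constexpr std::uint32_t rel32(std::uint32_t insnAddr, std::uint32_t target) {
    return target - (insnAddr + kRel32InsnSize);
}

// Write the opcode (0xE9 = jmp rel32, 0xE8 = call rel32) followed by the
// little-endian displacement; returns the number of bytes written.
inline unsigned encodeRel32(std::uint8_t opcode, std::uint32_t insnAddr,
                            std::uint32_t target, std::uint8_t *buf) {
    const std::uint32_t disp = rel32(insnAddr, target);
    buf[0] = opcode;
    for (unsigned i = 0; i < 4; i++)
        buf[1 + i] = static_cast<std::uint8_t>(disp >> (8 * i));
    return kRel32InsnSize;
}
```

// A displacement of 0 targets the instruction immediately after the jump,
// which is what generateNoOp relies on when it writes E9 00 00 00 00.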
1659 // set dest=1 if src1 op src2, otherwise dest = 0
1660 void emitRelOp(unsigned op, reg dest, reg src1, reg src2, unsigned char *&insn) {
1661 //fprintf(stderr,"Relop dest = %d, src1 = %d, src2 = %d\n", dest, src1, src2);
1662 emitOpRegReg(0x29, ECX, ECX, insn); // clear ECX
1663 emitMovRMToReg(EAX, EBP, -(src1*4), insn); // mov eax, -(src1*4)[ebp]
1664 emitOpRegRM(0x3B, EAX, EBP, -(src2*4), insn); // cmp eax, -(src2*4)[ebp]
1665 unsigned char opcode;
1667 case eqOp: opcode = JNE_R8; break;
1668 case neOp: opcode = JE_R8; break;
1669 case lessOp: opcode = JGE_R8; break;
1670 case leOp: opcode = JG_R8; break;
1671 case greaterOp: opcode = JLE_R8; break;
1672 case geOp: opcode = JL_R8; break;
1675 *insn++ = opcode; *insn++ = 1; // jcc 1
1676 emitSimpleInsn(0x40+ECX, insn); // inc ECX
1677 emitMovRegToRM(EBP, -(dest*4), ECX, insn); // mov -(dest*4)[ebp], ecx
1681 // set dest=1 if src1 op src2imm, otherwise dest = 0
1682 void emitRelOpImm(unsigned op, reg dest, reg src1, int src2imm, unsigned char *&insn) {
1683 //fprintf(stderr,"Relop dest = %d, src1 = %d, src2 = %d\n", dest, src1, src2);
1684 emitOpRegReg(0x29, ECX, ECX, insn); // clear ECX
1685 emitMovRMToReg(EAX, EBP, -(src1*4), insn); // mov eax, -(src1*4)[ebp]
1686 emitOpRegImm(0x3D, EAX, src2imm, insn); // cmp eax, src2
1687 unsigned char opcode;
1689 case eqOp: opcode = JNE_R8; break;
1690 case neOp: opcode = JE_R8; break;
1691 case lessOp: opcode = JGE_R8; break;
1692 case leOp: opcode = JG_R8; break;
1693 case greaterOp: opcode = JLE_R8; break;
1694 case geOp: opcode = JL_R8; break;
1697 *insn++ = opcode; *insn++ = 1; // jcc 1
1698 emitSimpleInsn(0x40+ECX, insn); // inc ECX
1699 emitMovRegToRM(EBP, -(dest*4), ECX, insn); // mov -(dest*4)[ebp], ecx
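// A minimal sketch, not part of the original file, of the trick used by
// emitRelOp and emitRelOpImm: ECX is cleared, the operands are compared, and
// a short jump on the INVERTED condition skips a one-byte "inc ecx". ECX thus
// ends up as 1 only when the relation holds. Rel, invertedJcc and
// emitSetIfTrue are hypothetical names; the opcode values are the standard
// IA-32 short-form Jcc encodings, assumed to equal this file's J*_R8 macros.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for eqOp, neOp, lessOp, leOp, greaterOp and geOp.
enum class Rel { Eq, Ne, Less, Le, Greater, Ge };

// Short-form (rel8) Jcc opcode that is taken when the relation does NOT hold
// (signed comparison).
inline std::uint8_t invertedJcc(Rel op) {
    switch (op) {
        case Rel::Eq:      return 0x75;  // jne
        case Rel::Ne:      return 0x74;  // je
        case Rel::Less:    return 0x7D;  // jge
        case Rel::Le:      return 0x7F;  // jg
        case Rel::Greater: return 0x7E;  // jle
        case Rel::Ge:      return 0x7C;  // jl
    }
    return 0;  // not reached for valid input
}

// Emit "jcc +1 ; inc ecx" into buf and return the number of bytes written.
inline unsigned emitSetIfTrue(Rel op, std::uint8_t *buf) {
    buf[0] = invertedJcc(op);
    buf[1] = 0x01;  // skip exactly one byte: the inc below
    buf[2] = 0x41;  // inc ecx (0x40 + register number 1)
    return 3;
}
```

// Branching on the inverted condition keeps the sequence to three bytes and
// avoids a second jump around an "else" path.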
1703 void emitEnter(short imm16, unsigned char *&insn) {
1705 *((short*)insn) = imm16;
1706 insn += sizeof(short);
1712 unsigned emitFuncCall(opCode op,
1714 char *ibuf, unsigned &base,
1715 const vector<AstNode *> &operands,
1716 const string &callee, process *proc,
1719 assert(op == callOp);
1724 addr = proc->findInternalAddress(callee, false, err);
1726 function_base *func = proc->findOneFunction(callee);
1728 ostrstream os(errorLine, 1024, ios::out);
1729 os << "Internal error: unable to find addr of " << callee << endl;
1731 showErrorCallback(80, (const char *) errorLine);
1734 addr = func->getAddress(0);
1737 for (unsigned u = 0; u < operands.size(); u++)
1738 srcs += operands[u]->generateCode(proc, rs, ibuf, base, noCost);
1740 unsigned char *insn = (unsigned char *) ((void*)&ibuf[base]);
1741 unsigned char *first = insn;
1743 // push arguments in reverse order, last argument first
1744 // must use int instead of unsigned to avoid nasty underflow problem:
1745 for (int i=srcs.size() - 1 ; i >= 0; i--) {
1746 emitOpRMReg(PUSH_RM_OPC1, EBP, -(srcs[i]*4), PUSH_RM_OPC2, insn);
1747 rs->freeRegister(srcs[i]);
1751 // we are using an indirect call here because we don't know the
1752 // address of this instruction, so we can't use a relative call.
1753 // TODO: change this to use a direct call
1754 emitMovImmToReg(EAX, addr, insn); // mov eax, addr
1755 emitOpRegReg(CALL_RM_OPC1, CALL_RM_OPC2, EAX, insn); // call *%eax (indirect call through eax)
1757 // reset the stack pointer
1758 if (srcs.size() > 0)
1759 emitOpRegImm(0, ESP, srcs.size()*4, insn); // add esp, srcs.size()*4
1761 // allocate a (virtual) register to store the return value
1762 reg ret = rs->allocateRegister((char *)insn, base, noCost);
1763 emitMovRegToRM(EBP, -(ret*4), EAX, insn);
1765 base += insn - first;
1772 * emit code for op(src1, src2, dest)
1773 * ibuf is an instruction buffer where instructions are generated
1774 * base is the next free position on ibuf where code is to be generated
1776 unsigned emit(opCode op, reg src1, reg src2, reg dest, char *ibuf, unsigned &base,
1779 unsigned char *insn = (unsigned char *) (&ibuf[base]);
1780 unsigned char *first = insn;
1782 if (op == updateCostOp) {
1783 // src1 is the cost value
1785 // dest is the address of observed cost
1788 // update observed cost
1789 // dest = address of DYNINSTobsCostLow
1791 emitAddMemImm32(dest, src1, insn); // ADD (dest), src1
1796 } else if (op == loadConstOp) {
1797 // dest is a temporary
1798 // src1 is an immediate value
1799 // dest = src1:imm32
1800 emitMovImmToRM(EBP, -(dest*4), src1, insn);
1802 } else if (op == loadOp) {
1803 // dest is a temporary
1804 // src1 is the address of the operand
1806 emitMovMToReg(EAX, src1, insn); // mov eax, src1
1807 emitMovRegToRM(EBP, -(dest*4), EAX, insn); // mov -(dest*4)[ebp], eax
1809 } else if (op == loadIndirOp) {
1810 // same as loadOp, but the value to load is already in a register
1811 emitMovRMToReg(EAX, EBP, -(src1*4), insn); // mov eax, -(src1*4)[ebp]
1812 emitMovRMToReg(EAX, EAX, 0, insn); // mov eax, [eax]
1813 emitMovRegToRM(EBP, -(dest*4), EAX, insn); // mov -(dest*4)[ebp], eax
1815 } else if (op == storeOp) {
1817 // dest has the address where src1 is to be stored
1818 // src1 is a temporary
1819 // src2 is a "scratch" register, we don't need it in this architecture
1820 emitMovRMToReg(EAX, EBP, -(src1*4), insn); // mov eax, -(src1*4)[ebp]
1821 emitMovRegToM(dest, EAX, insn); // mov dest, eax
1823 } else if (op == storeIndirOp) {
1824 // same as storeOp, but the address where to store is already in a
1826 emitMovRMToReg(EAX, EBP, -(src1*4), insn); // mov eax, -(src1*4)[ebp]
1827 emitMovRMToReg(ECX, EBP, -(dest*4), insn); // mov ecx, -(dest*4)[ebp]
1828 emitMovRegToRM(ECX, 0, EAX, insn); // mov [ecx], eax
1830 } else if (op == ifOp) {
1831 // if src1 == 0 jump to dest
1832 // src1 is a temporary
1833 // dest is a target address
1834 emitOpRegReg(0x29, EAX, EAX, insn); // sub EAX, EAX ; clear EAX
1835 emitOpRegRM(0x3B, EAX, EBP, -(src1*4), insn); // cmp eax, -(src1*4)[ebp]
1839 *((int *)insn) = dest;
1840 insn += sizeof(int);
1844 } else if (op == branchOp) {
1845 emitJump(dest - JUMP_REL32_SZ, insn);
1846 base += JUMP_REL32_SZ;
1847 return(base - JUMP_REL32_SZ);
1848 } else if (op == trampPreamble) {
1849 // no code is needed here
1851 } else if (op == trampTrailer) {
1853 // generate the template for a jump -- actual jump is generated elsewhere
1854 emitJump(0, insn); // jump xxxx
1855 // return the offset of the previous jump
1856 base += insn - first;
1857 return(base - JUMP_REL32_SZ);
1859 } else if (op == noOp) {
1860 emitSimpleInsn(NOP, insn); // nop
1862 } else if (op == getRetValOp) {
1863 // dest is a register where we can store the value
1864 // the return value is in the saved EAX
1865 emitMovRMToReg(EAX, EBP, SAVED_EAX_OFFSET, insn);
1866 emitMovRegToRM(EBP, -(dest*4), EAX, insn);
1867 base += insn - first;
1870 } else if (op == getParamOp) {
1871 // src1 is the number of the argument
1872 // dest is a register where we can store the value
1873 // Parameters are addressed by a positive offset from ebp,
1874 // the first is PARAM_OFFSET[ebp]
1875 emitMovRMToReg(EAX, EBP, PARAM_OFFSET + src1*4, insn);
1876 emitMovRegToRM(EBP, -(dest*4), EAX, insn);
1877 base += insn - first;
1880 } else if (op == saveRegOp) {
1881 // should not be used on this platform
1889 // dest = src1 + src2
1893 opcode = 0x03; // ADD
1897 opcode = 0x2B; // SUB
1901 opcode = 0x0FAF; // IMUL
1905 // dest = src1 div src2
1907 // cdq ; edx = sign extend of eax
1908 // idiv eax, src2 ; eax = edx:eax div src2, edx = edx:eax mod src2
1910 emitMovRMToReg(EAX, EBP, -(src1*4), insn);
1911 emitSimpleInsn(0x99, insn);
1912 emitOpRegRM(0xF7, 0x7 /*opcode extension*/, EBP, -(src2*4), insn);
1913 emitMovRegToRM(EBP, -(dest*4), EAX, insn);
1920 opcode = 0x0B; // OR
1924 opcode = 0x23; // AND
1928 // dest = src1 relop src2
1935 emitRelOp(op, dest, src1, src2, insn);
1945 emitMovRMToReg(EAX, EBP, -(src1*4), insn);
1946 emitOpRegRM(opcode, EAX, EBP, -(src2*4), insn);
1947 emitMovRegToRM(EBP, -(dest*4), EAX, insn);
1949 base += insn - first;
1954 unsigned emitImm(opCode op, reg src1, int src2imm, reg dest, char *ibuf, unsigned &base, bool)
1956 unsigned char *insn = (unsigned char *) (&ibuf[base]);
1957 unsigned char *first = insn;
1959 if (op == storeOp) {
1961 // dest has the address where src1 is to be stored
1962 // src1 is an immediate value
1963 // src2 is a "scratch" register, we don't need it in this architecture
1964 emitMovImmToReg(EAX, dest, insn);
1965 emitMovImmToRM(EAX, 0, src1, insn);
1973 opcode2 = 0x0; // ADD
1978 opcode2 = 0x5; // SUB
1983 if (isPowerOf2(src2imm, result) && result <= MAX_IMM8) {
1985 emitMovRMToReg(EAX, EBP, -(src1*4), insn);
1986 emitMovRegToRM(EBP, -(dest*4), EAX, insn);
1989 emitOpRMImm8(0xC1, 4, EBP, -(dest*4), result, insn);
1992 // imul EAX, -(src1*4)[ebp], src2imm
1993 emitOpRegRMImm(0x69, EAX, EBP, -(src1*4), src2imm, insn);
1994 emitMovRegToRM(EBP, -(dest*4), EAX, insn);
2003 if (isPowerOf2(src2imm, result) && result <= MAX_IMM8) {
2005 emitMovRMToReg(EAX, EBP, -(src1*4), insn);
2006 emitMovRegToRM(EBP, -(dest*4), EAX, insn);
2009 emitOpRMImm8(0xC1, 7, EBP, -(dest*4), result, insn);
2012 // dest = src1 div src2imm
2014 // cdq ; edx = sign extend of eax
2016 // idiv eax, ebx ; eax = edx:eax div src2, edx = edx:eax mod src2
2018 emitMovRMToReg(EAX, EBP, -(src1*4), insn);
2019 emitSimpleInsn(0x99, insn);
2020 emitMovImmToReg(EBX, src2imm, insn);
2022 emitOpRegReg(0xF7, 0x7 /*opcode extension*/, EBX, insn);
2023 emitMovRegToRM(EBP, -(dest*4), EAX, insn);
2033 opcode2 = 0x1; // OR
2038 opcode2 = 0x4; // AND
2042 // dest = src1 relop src2
2049 emitRelOpImm(op, dest, src1, src2imm, insn);
2059 emitMovRMToReg(EAX, EBP, -(src1*4), insn);
2060 emitMovRegToRM(EBP, -(dest*4), EAX, insn);
2062 emitOpRMImm(opcode1, opcode2, EBP, -(dest*4), src2imm, insn);
2064 base += insn - first;
2070 int getInsnCost(opCode op)
2072 if (op == loadConstOp) {
2074 } else if (op == loadOp) {
2076 } else if (op == loadIndirOp) {
2078 } else if (op == storeOp) {
2080 } else if (op == storeIndirOp) {
2082 } else if (op == ifOp) {
2084 } else if (op == branchOp) {
2085 return(1); /* XXX Need to find out what value this should be. */
2086 } else if (op == callOp) {
2087 // cost of call only
2089 } else if (op == updateCostOp) {
2091 } else if (op == trampPreamble) {
2093 } else if (op == trampTrailer) {
2095 } else if (op == noOp) {
2097 } else if (op == getRetValOp) {
2099 } else if (op == getParamOp) {
2110 return(1+1+2+1+1+1);
2131 bool process::heapIsOk(const vector<sym_data> &find_us) {
2136 // find the main function
2137 // first look for main or _main
2138 #if !defined(i386_unknown_nt4_0)
2139 if (!((mainFunction = findOneFunction("main"))
2140 || (mainFunction = findOneFunction("_main")))) {
2141 string msg = "Cannot find main. Exiting.";
2142 statusLine(msg.string_of());
2143 showErrorCallback(50, msg);
2147 if (!((mainFunction = findOneFunction("main"))
2148 || (mainFunction = findOneFunction("_main"))
2149 || (mainFunction = findOneFunction("WinMain"))
2150 || (mainFunction = findOneFunction("_WinMain")))) {
2151 string msg = "Cannot find main or WinMain. Exiting.";
2152 statusLine(msg.string_of());
2153 showErrorCallback(50, msg);
2158 for (unsigned i=0; i<find_us.size(); i++) {
2159 str = find_us[i].name;
2160 if (!getSymbolInfo(str, sym, baseAddr)) {
2161 string str1 = string("_") + str.string_of();
2162 if (!getSymbolInfo(str1, sym, baseAddr) && find_us[i].must_find) {
2164 msg = string("Cannot find ") + str + string(". Exiting");
2165 statusLine(msg.string_of());
2166 showErrorCallback(50, msg);
2172 // string ghb = GLOBAL_HEAP_BASE;
2173 // if (!getSymbolInfo(ghb, sym, baseAddr)) {
2174 // ghb = U_GLOBAL_HEAP_BASE;
2175 // if (!getSymbolInfo(ghb, sym, baseAddr)) {
2177 // msg = string("Cannot find ") + str + string(". Exiting");
2178 // statusLine(msg.string_of());
2179 // showErrorCallback(50, msg);
2183 // Address instHeapEnd = sym.addr()+baseAddr;
2184 // addInternalSymbol(ghb, instHeapEnd);
2186 string ghb = INFERIOR_HEAP_BASE;
2187 if (!getSymbolInfo(ghb, sym, baseAddr)) {
2188 ghb = UINFERIOR_HEAP_BASE;
2189 if (!getSymbolInfo(ghb, sym, baseAddr)) {
2191 msg = string("Cannot find ") + ghb + string(". Cannot use this application");
2192 statusLine(msg.string_of());
2193 showErrorCallback(50, msg);
2197 Address curr = sym.addr()+baseAddr;
2199 #if !defined(i386_unknown_nt4_0)
2200 string tt = "DYNINSTtrampTable";
2201 if (!getSymbolInfo(tt, sym, baseAddr)) {
2203 msg = string("Cannot find ") + tt + string(". Cannot use this application");
2204 statusLine(msg.string_of());
2205 showErrorCallback(50, msg);
2210 // Check that we can patch up user code to jump to our base trampolines:
2211 const Address instHeapStart = curr;
2212 const Address instHeapEnd = instHeapStart + SYN_INST_BUF_SIZE - 1;
2214 if (instHeapEnd > getMaxBranch()) {
2215 logLine("*** FATAL ERROR: Program text + data too big for dyninst\n");
2216 sprintf(errorLine, " heap starts at %x and ends at %x\n",
2217 instHeapStart, instHeapEnd);
2227 dictionary_hash<string, unsigned> funcFrequencyTable(string::hash);
2230 // initDefaultPointFrequencyTable - define the expected call frequency of
2231 // procedures. Currently we just define several one shots with a
2232 // frequency of one, and provide a hook to read a file with more accurate
2235 void initDefaultPointFrequencyTable()
2241 funcFrequencyTable["main"] = 1;
2242 funcFrequencyTable["DYNINSTsampleValues"] = 1;
2243 funcFrequencyTable[EXIT_NAME] = 1;
2245 // try to read file.
2246 fp = fopen("freq.input", "r");
2250 printf("found freq.input file\n");
2253 fscanf(fp, "%s %f\n", name, &value);
2254 funcFrequencyTable[name] = (int) value;
2255 printf("adding %s %f\n", name, value);
2261 * Get an estimate of the frequency for the passed instPoint.
2262 * This is not (always) the same as the function that contains the point.
2264 * The function is selected as follows:
2266 * If the point is an entry or an exit return the function name.
2267 * If the point is a call and the callee can be determined, return the called
2269 * else return the function containing the point.
2271 * WARNING: This code contains arbitrary values for func frequency (both user
2272 * and system). This should be refined over time.
2274 * Using 1000 calls/sec to be one SD from the mean for most FPSPEC apps.
2278 float getPointFrequency(instPoint *point)
2281 pd_Function *func = point->callee();
2284 func = point->func();
2286 if (!funcFrequencyTable.defines(func->prettyName())) {
2287 if (func->isLibTag()) {
2290 // Changing this value from 250 to 100 because predictedCost was
2291 // too high - naim 07/18/96
2295 return ((float)funcFrequencyTable[func->prettyName()]);
2300 // return cost in cycles of executing at this point. This is the cost
2301 // of the base tramp if it is the first at this point or 0 otherwise.
2303 int getPointCost(process *proc, const instPoint *point)
2305 instPoint *temp = (instPoint *)point;
2306 temp->checkInstructions();
2308 if (proc->baseMap.defines(point)) {
2311 if (point->size() == 1)
2312 return 9000; // estimated number of cycles for traps
2320 bool returnInstance::checkReturnInstance(const vector<Address> &, u_int &) {
2324 void returnInstance::installReturnInstance(process *proc) {
2325 assert(instructionSeq);
2326 proc->writeTextSpace((void *)addr_, instSeqSize, instructionSeq->ptr());
2327 delete instructionSeq;
2331 void returnInstance::addToReturnWaitingList(Address , process *) {
2335 void generateBreakPoint(instruction &insn) {
2339 void instWaitingList::cleanUp(process *, Address ) {
2343 /* ***************************************************** */
2345 bool process::emitInferiorRPCheader(void *void_insnPtr, unsigned &base) {
2346 unsigned char *insnPtr = (unsigned char *)void_insnPtr;
2347 unsigned char *origInsnPtr = insnPtr;
2350 // We emit the following here (to set up a fresh stack frame):
2351 // pushl %ebp (0x55)
2352 // movl %esp, %ebp (0x89 0xe5)
2356 emitSimpleInsn(PUSH_EBP, insnPtr);
2357 emitMovRegToReg(EBP, ESP, insnPtr);
2358 // allocate space for temporaries (virtual registers)
2359 emitOpRegImm(5, ESP, 128, insnPtr); // sub esp, 128
2360 emitSimpleInsn(PUSHAD, insnPtr);
2361 emitSimpleInsn(PUSHFD, insnPtr);
2363 base += (insnPtr - origInsnPtr);
2368 bool process::emitInferiorRPCtrailer(void *void_insnPtr, unsigned &base,
2369 unsigned &breakOffset,
2370 bool shouldStopForResult,
2371 unsigned &stopForResultOffset,
2372 unsigned &justAfter_stopForResultOffset) {
2373 unsigned char *insnPtr = (unsigned char *)void_insnPtr;
2374 // unsigned char * is the most natural to work with on x86, since instructions
2375 // are always an integral # of bytes. Besides, it makes the following line easy:
2376 insnPtr += base; // start off in the right spot
2378 if (shouldStopForResult) {
2379 // illegal insn: 0x0f0b does the trick.
2380 stopForResultOffset = base;
2385 justAfter_stopForResultOffset = base;
2388 // Sequence: popfd, popad, leave (0xc9), call DYNINSTbreakPoint(), illegal
2390 emitSimpleInsn(POPFD, insnPtr); // popfd
2391 emitSimpleInsn(POPAD, insnPtr); // popad
2392 emitSimpleInsn(LEAVE, insnPtr); // leave
2393 base += 3; // all simple insns are 1 byte
2395 // We can't do a SIGTRAP since SIGTRAP is reserved in x86.
2396 // So we do a SIGILL instead.
2402 // Here, we should generate an illegal insn, or something.
2403 // A two-byte insn, 0x0f0b, should do the trick. The idea is that
2404 // the code should never be executed.
2412 // process::replaceFunctionCall
2414 // Replace the function call at the given instrumentation point with a call to
2415 // a different function, or with a NOOP. In order to replace the call with a
2416 // NOOP, pass NULL as the parameter "func."
2417 // Returns true if successful, false if not. Fails if the site is not a call
2418 // site, or if the site has already been instrumented using a base tramp.
2420 // Note that right now we can only replace a call instruction that is five
2421 // bytes long (like a call to a 32-bit relative address).
2422 bool process::replaceFunctionCall(const instPoint *point,
2423 const function_base *func) {
2424 // Must be a call site
2425 if (!point->insnAtPoint().isCall())
2428 // Cannot already be instrumented with a base tramp
2429 if (baseMap.defines(point))
2433 if (func == NULL) { // Replace with NOOPs
2434 unsigned char *newInsn = new unsigned char[point->insnAtPoint().size()];
2435 unsigned char *p = newInsn;
2436 for (int i = 0; i < point->insnAtPoint().size(); i++)
2437 emitSimpleInsn(NOP, p);
2438 writeTextSpace((void *)point->iPgetAddress(),
2439 point->insnAtPoint().size(), newInsn);
2440 } else { // Replace with a call to a different function
2441 // XXX Right now, the call has to be 5 bytes -- sometime, we should make
2442 // it work for other calls as well.
2443 assert(point->insnAtPoint().size() == CALL_REL32_SZ);
2444 unsigned char *newInsn = new unsigned char[CALL_REL32_SZ];
2445 unsigned char *p = newInsn;
2446 emitCallRel32(func->addr() - (point->iPgetAddress()+CALL_REL32_SZ), p);
2447 writeTextSpace((void *)point->iPgetAddress(), CALL_REL32_SZ, newInsn);